Abstract



In this thesis we consider the classical capacity of certain quantum channels, that is, 
the maximum rate at which classical information, encoded as quantum states, can 
be transmitted reliably over a quantum channel. 

We first concentrate on the product-state capacity of a particular quantum channel,
that is, the capacity which is achieved by encoding the output states from a
source into codewords comprising states taken from ensembles of non-entangled
(i.e. separable) states and sending them over copies of the quantum channel. Using
the "single-letter" formula proved by Holevo [1] and by Schumacher and Westmoreland,
we obtain the product-state capacity of the qubit quantum amplitude-damping
channel, which is determined by a transcendental equation in a single real
variable and can be solved numerically. We demonstrate that the product-state
capacity of this channel can be achieved using a minimal ensemble of non-orthogonal
pure states. We also consider the generalised amplitude-damping channel and show
that the technique used to calculate the product-state capacity for the "traditional"
amplitude-damping channel also holds for this channel.

In the following chapter we consider the classical capacity of two quantum channels
with memory, namely a periodic channel with quantum depolarising channel
branches and a convex combination of quantum channels. The classical capacity
is defined as the limit of the capacity of a channel, using a block of states which
are permitted to be entangled over n channel uses and divided by n, as n tends to
infinity.

We prove that the classical capacity for each of the classical memory channels men- 
tioned above is, in fact, equal to the respective product-state capacities. For those 
channels this means that the classical capacity is achieved without the use of en- 
tangled input-states. We also demonstrate that the method used in the proof of the 
classical capacity of a periodic channel with depolarising channels does not hold 
for a periodic channel with amplitude-damping channel branches. This is due to 
the fact that, unlike the depolarising channel, the maximising ensemble for a qubit 
amplitude-damping channel is not the same for all amplitude-damping channels. 

We also investigate the product-state capacity of a convex combination of two
memoryless channels, which was shown in [3] to be given by the supremum of the
minimum of the corresponding Holevo quantities, and we show in particular that the
product-state capacity of a convex combination of a depolarising and an
amplitude-damping channel is not equal to the minimum of their product-state capacities.

Next we introduce the channel coding theorem for memoryless quantum channels, 
providing a known proof [4] for the strong converse of the theorem. We then con- 
sider the strong converse to the channel coding theorem for a periodic quantum 
channel. 
Notation

Symbol                          Interpretation

$A^*$                           Conjugate transpose of $A$
$\mathrm{Tr}(A)$                Trace of operator $A$
$\|A\|_1$                       Trace-norm of operator $A$, given by $\mathrm{Tr}\sqrt{A^* A}$
$\log$                          Binary logarithm
$\ln$                           Natural logarithm
$H(p)$                          Shannon entropy of probability distribution $p$
$|\psi\rangle$                  Complex vector
$\rho$                          Quantum state (density operator)
$S(\rho)$                       von Neumann entropy of state $\rho$
$\mathcal{N}$                   Classical channel
$\Phi$                          Quantum channel
$\mathcal{B}(\mathcal{H})$      Set of quantum states on Hilbert space $\mathcal{H}$
$\chi(\{p_j, \rho_j\})$         Holevo quantity with respect to the ensemble $\{p_j, \rho_j\}$
$\chi^*(\Phi)$                  Product-state capacity of $\Phi$, given by
                                $\max_{\{p_j, \rho_j\}} \chi(\{p_j, \Phi(\rho_j)\})$
Chapter 1



Introduction 



Informally, the capacity of a channel can be considered to be a measure of the 
channel's usefulness for sending information faithfully from source (or sender) to
receiver. The capacity of a quantum channel can be thought of as a measure of
the closeness of that channel to the quantum identity channel, which itself sends 
quantum information with perfect fidelity. Throughout the thesis we concentrate on 
the case where classical messages (or output from a classical information source) 
are encoded into quantum states and sent over quantum channels. 

Classical information can be encoded into different types of quantum states, i.e. or- 
thogonal or non-orthogonal, separable or entangled. Note that the latter is a purely 
quantum mechanical phenomenon. We are interested in the type of states and en- 
sembles which achieve the capacity of certain quantum channels and pay particular 
attention to the capacity of noisy quantum channels with memory, i.e. channels 
which have correlations between successive uses. 




We first introduce the concept of classical information entropy and classical channel
capacity (see [5] and [6], for example). We do so because a great deal of what has
been achieved in the field of quantum information theory to date has been inspired
by results in classical information theory, most notably Claude Shannon's seminal
article on classical channel capacity, published in 1948. The brief review
of Shannon's work on channel capacities is also justified in order to demonstrate
that not all of his results on classical channels have been successfully generalised
to the quantum setting. Unlike classical channels, quantum channels have a number
of different types of capacity, namely the classical capacity, the quantum capacity
and the private capacity. These capacities have not yet been fully resolved.
Moreover, entangled input states, mentioned above, have recently been shown to
improve the classical capacity of quantum channels. Hastings [8], building on a
result by Hayden and Winter [9], recently presented a violation of one of the longest
standing conjectures in quantum information theory, namely the additivity conjecture
involving the Holevo quantity [1, 2]. This counterexample implies that the
conjectured formula for the classical capacity of a quantum channel is disproved and
that a simple "single-letter" formula for the capacity remains to be discovered. The
classical capacity of a quantum channel can therefore only be determined
asymptotically. The question of whether this is an intrinsic property of the classical
capacity of a quantum channel or whether there is some missing element which has not
yet been understood remains open.



Smith and Yard [10] also recently proved the non-additivity of the quantum channel
capacity, disproving the operational interpretation of the additivity of the quantum
capacity of quantum channels, by showing that two channels, each with zero quantum
capacity, when used together can give rise to a non-zero capacity. This is known
as "superactivation" of channel capacity. Cubitt, Chen and Harrow [11] have also
demonstrated a similar result for the zero-error classical capacity of a quantum
channel. Smith and Smolin [12] and Li, Winter, Zou and Guo [13] have proved the
non-additivity of the private capacity for a family of quantum channels.



1.1 Classical information theory 

In information theory, entropy measures the amount of uncertainty in the state of
a system before measurement. Shannon entropy measures the entropy of a variable
associated with a classical probability distribution. More formally, the Shannon
entropy of a random variable $X$ with probability distribution $p(x)$ is given by
$H(X) = -\sum_x p(x) \log p(x)$. As entropy is measured in bits, $\log$ is taken to
the base 2, with the convention $0 \log 0 = 0$.
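As a small illustration of the definition above, the following Python sketch evaluates the Shannon entropy of a finite distribution. The function name `shannon_entropy` and the list-based representation of $p(x)$ are choices made here for illustration only.

```python
import math

def shannon_entropy(p):
    """Shannon entropy in bits of a finite probability distribution.

    Terms with p(x) = 0 are skipped, which implements the convention
    0 log 0 = 0.
    """
    return sum(-x * math.log2(x) for x in p if x > 0)

fair_coin = shannon_entropy([0.5, 0.5])    # one bit of uncertainty
certain = shannon_entropy([1.0, 0.0])      # no uncertainty
```

A uniform distribution over four outcomes gives two bits, as expected from $\log 4 = 2$.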

Mutual information, $H(X:Y)$, measures the amount of information two random
variables $X$ and $Y$ have in common,

$$ H(X:Y) = H(X) + H(Y) - H(X,Y), \tag{1.1.1} $$

where $H(X,Y)$ is the joint entropy [14].
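Equation (1.1.1) can be evaluated directly from a joint distribution. The sketch below assumes the joint distribution is supplied as a nested list with `joint[x][y]` equal to $p(x, y)$; the helper names are illustrative.

```python
import math

def entropy(probs):
    """Shannon entropy in bits, with the convention 0 log 0 = 0."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """H(X:Y) = H(X) + H(Y) - H(X,Y) for joint[x][y] = p(x, y)."""
    p_x = [sum(row) for row in joint]            # marginal of X
    p_y = [sum(col) for col in zip(*joint)]      # marginal of Y
    p_xy = [p for row in joint for p in row]     # flattened joint
    return entropy(p_x) + entropy(p_y) - entropy(p_xy)

correlated = mutual_information([[0.5, 0.0], [0.0, 0.5]])
independent = mutual_information([[0.25, 0.25], [0.25, 0.25]])
```

Perfectly correlated bits share one bit of information, while independent bits share none.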

Let $\mathcal{I}$ and $\mathcal{O}$ be the respective input and output alphabets for a
classical channel and let $X^n$ and $Y^n$ both be sequences of random variables such
that $x \in \mathcal{I}$ and $y \in \mathcal{O}$. A channel can be described in terms
of the conditional probabilities $p(y|x)$, i.e. the probabilities of obtaining
different outcomes, $y$, given the input variable $x$.

1.1.1 Shannon's noisy channel coding theorem 

The capacity of a classical channel $\mathcal{N}$ provides a limit on the number of
classical bits which can be transmitted reliably per channel use. The direct part of
the classical channel coding theorem [7] states that using $n$ copies of the channel,
$M$ bits of information can be sent reliably over the channel at a rate $R = M/n$ if
and only if $R < C$ in the asymptotic limit.

The strong converse of the channel coding theorem states that if the rate at which 
classical information is transmitted over a classical channel exceeds the capacity 
of the channel, i.e. if R > C, then the probability of decoding the information 
correctly goes to zero in the number of channel uses. 

The capacity of a noisy classical channel, $\mathcal{N}$, is given by the maximum of
the mutual information obtained over all possible input distributions, $p(x)$, for
$X^n$,

$$ C(\mathcal{N}) = \max_{p(x)} H(X:Y), \tag{1.1.2} $$

where $H(X:Y)$ is given by Equation (1.1.1).
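To make the maximisation in Equation (1.1.2) concrete, the sketch below performs a grid search over input distributions for a binary symmetric channel with flip probability `f`. The binary symmetric channel is not discussed in the text and is used here only as an assumed example; the mutual information is written in the equivalent form $H(Y) - H(Y|X)$.

```python
import math

def binary_entropy(q):
    """Entropy in bits of a binary distribution (q, 1 - q)."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def bsc_mutual_information(q, f):
    """H(X:Y) for input P(X=0) = q over a channel flipping bits w.p. f."""
    r = q * (1 - f) + (1 - q) * f          # P(Y = 0)
    return binary_entropy(r) - binary_entropy(f)

def bsc_capacity(f, steps=1000):
    """Maximise the mutual information over a grid of input distributions."""
    return max(bsc_mutual_information(k / steps, f) for k in range(steps + 1))

capacity = bsc_capacity(0.1)
```

For this channel the maximum is attained at the uniform input distribution, so the grid search agrees with the closed form $1 - h(f)$, where $h$ is the binary entropy.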

The first proof of Shannon's coding theorem is due to Feinstein [15].
1.2 Quantum information theory

Quantum communication promises to allow unconditionally secure communication
[16]. Techniques to protect quantum information from noise are therefore of great
importance. A simple "single-letter" formula which could be used to calculate the
classical capacity of a quantum channel would lead to a better understanding of
optimal encodings used to protect quantum information from errors. Whether such
a formula can be found remains an open question.



The capacities of quantum channels with memory, widely considered to be more
realistic than memoryless channels, are being explored [17]-[19].



The quantum analogue of Shannon entropy is von Neumann entropy. It is defined as
follows. The entropy of a quantum state, $\rho$, is given by the von Neumann entropy,
$S(\rho) = -\mathrm{Tr}(\rho \log \rho)$. If $\rho$ has eigenvalues $\lambda_j$, then
$S(\rho) = -\sum_j \lambda_j \log(\lambda_j)$.
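For a single qubit the eigenvalue form of the von Neumann entropy can be evaluated in closed form. The following sketch assumes the state is given as a 2x2 Hermitian matrix stored as nested lists; the function names are illustrative, and larger systems would need a general eigenvalue routine.

```python
import math

def qubit_eigenvalues(rho):
    """Eigenvalues of a 2x2 Hermitian matrix [[a, b], [conj(b), d]]."""
    a, d, b = rho[0][0].real, rho[1][1].real, rho[0][1]
    disc = math.sqrt((a - d) ** 2 + 4 * abs(b) ** 2)
    return ((a + d + disc) / 2, (a + d - disc) / 2)

def von_neumann_entropy(rho):
    """S(rho) = -sum_j lambda_j log2(lambda_j), skipping zero eigenvalues."""
    return sum(-lam * math.log2(lam)
               for lam in qubit_eigenvalues(rho) if lam > 1e-15)

maximally_mixed = von_neumann_entropy([[0.5, 0.0], [0.0, 0.5]])
pure_plus = von_neumann_entropy([[0.5, 0.5], [0.5, 0.5]])
```

The maximally mixed qubit has one bit of entropy, whereas any pure state, such as the projection onto $(|0\rangle + |1\rangle)/\sqrt{2}$ used above, has zero entropy.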

1.2.1 Noisy quantum channel coding 



Figure 1.1 depicts a quantum information transmission process from source to
receiver [20].

[Figure: block diagram with the stages Source, Encoding, Input, Channel, Output,
Decoding, Receiver.]

Figure 1.1: Transmitting classical information over a single quantum channel.

The sender encodes their message into a block of quantum states. This codeword
can then be transmitted over copies of a quantum channel, $\Phi$; see Figure 1.2.

Figure 1.2: Transmitting classical information over copies of a quantum channel.

The above encodings $C_n$ and $E_n$ will be important in a later chapter when we
investigate the channel coding theorem for quantum channels.



1.3 Thesis layout 

In Chapter 2 we introduce some mathematical preliminaries and discuss concepts 
fundamental to the understanding of quantum information theory. 



We obtain a maximiser for the quantum mutual information for classical information
sent over the qubit amplitude-damping and depolarising channels in Chapter 3.
This is achieved by limiting the ensemble of input states to antipodal states in the
calculation of the product-state capacity for the channels. In Section 3.1 we
evaluate the capacity of the amplitude-damping channel and plot a graph of this
capacity versus the damping parameter. We discuss the "generalised" amplitude-damping
channel in Section 3.2 and show that the approach taken to calculate the
product-state capacity of the conventional amplitude-damping channel can also be taken
for this channel. We introduce the depolarising channel in Section 3.3 and discuss the
maximising ensemble of the corresponding Holevo quantity. The contents of this
chapter have been published by T.C. Dorlas and the Author in [21].



Chapter 4 is based on an article published by Dorlas together with the Author [19].
Here we investigate the classical capacity of two quantum channels with memory,
that is, a periodic channel with depolarising channel branches, and a convex
combination of depolarising channels. We prove that the capacity is additive in both
cases. As a result, the channel capacity is achieved without the use of entangled
input states. In the case of a convex combination of depolarising channels the proof
provided can be extended to other quantum channels whose classical capacity has
been proved to be additive in the memoryless case.



In Section 4.3 we introduce the periodic channel and investigate the product-state
capacity of the channel with depolarising channel branches. We derive a result
based on the invariance of the maximising ensemble of the depolarising channel,
which enables us to prove that the capacity of such a periodic channel is additive.

In Section 4.4 the additivity of the classical capacity of a convex combination of
depolarising channels is proved. This is done independently of the result derived in
Section 4.3 and can therefore be generalised to a class of other quantum channels.

In Section 4.5 we state the theorem proved by Datta and Dorlas in [3] concerning
the product-state capacity of a convex combination of memoryless channels
and we show that in the case of two (or more) depolarising channels or two (or
more) amplitude-damping channels, this is in fact equal to the minimum of the
individual capacities. We show, however, that this is not the case for a convex
combination of a depolarising channel and an amplitude-damping channel.

The channel coding theorem and its strong converse are discussed in Chapter 5, and we
provide the proof by Winter [4]. We then consider the strong converse for a periodic
quantum channel, in light of a result shown in Section 4.6 for the periodic channel
with amplitude-damping channel branches.

Appendix A.1 states Carathéodory's Theorem, which is used in Chapter 3. A proof,
provided by N. Datta, that it is sufficient to consider ensembles consisting of at
most $d^2$ pure states in the maximisation of the Holevo quantity for a
CPT map is given in Appendix A.2.

The proof of the product-state capacity of a periodic quantum channel, provided
by Datta and Dorlas, is given in Appendix B.1. The periodic channel is a special
case of a channel with arbitrary Markovian noise correlations. The proof of the
formula for the product-state capacity of such a channel, i.e. one with noise given
by arbitrary Markovian correlations, is given by Datta and Dorlas in [3].



Chapter 2 



Preliminaries 



We begin by establishing some concepts fundamental to the study of quantum in- 
formation theory. We build on the framework of quantum information transmission 
introduced in the previous chapter by making these ideas mathematically concrete. 
We introduce quantum states and channels and describe the operator sum represen- 
tation, a tool widely used in quantum information theory to describe the behavior 
of an input state with a given quantum channel. The definition of quantum en- 
tanglement is provided and we discuss the mutual information between different 
quantum states and the Holevo-Schumacher-Westmoreland theorem, which is used
to determine the product-state capacity for classical information sent over quantum 
channels. The theory of Markov processes is introduced, providing definitions nec- 
essary for Chapter 4, where we introduce two channels which have memory that 
can be described using Markov chains. 




2.1 Quantum states and quantum channels 

We now provide definitions for quantum states and quantum channels. 

2.1.1 Quantum state 

A quantum state is given by a positive semi-definite Hermitian operator of unit trace 
on a Hilbert space. We now define the terms used in this definition. 

A Hilbert space $\mathcal{H}$ is a complex vector space equipped with an inner
product. Note that we will only consider finite-dimensional Hilbert spaces. An element
of a Hilbert space, known as a vector, is denoted $|v\rangle$. An element of the dual
space $\mathcal{H}^*$, the conjugate transpose of $|v\rangle \in \mathcal{H}$, is
denoted $\langle v|$, where

$$ |v\rangle = \begin{pmatrix} v_1 \\ \vdots \\ v_d \end{pmatrix} \in \mathcal{H},
\qquad \langle v| = (\bar{v}_1 \cdots \bar{v}_d) \in \mathcal{H}^*, \tag{2.1.1} $$

and $\bar{v}$ is the complex conjugate of $v$. In fact, due to the correspondence
between $\mathcal{H}$ and its dual space $\mathcal{H}^*$, given by the inner product,
we can consider each element $|v\rangle \in \mathcal{H}$ as an element of
$\mathcal{H}^*$.

The norm of the vector $|u\rangle \in \mathcal{H}$ is defined as follows,

$$ \|u\| = \sqrt{\langle u|u\rangle}. \tag{2.1.2} $$

Note that positive operators are a subclass of Hermitian operators; both are defined
below. An operator $A \in \mathcal{B}(\mathcal{H})$ is Hermitian if $A^* = A$, where
$A^*$ is the adjoint of the operator $A$, defined by
$\langle Au|v\rangle = \langle u|A^* v\rangle$ for all vectors $|u\rangle$ and
$|v\rangle$ in the state space of $A$.

An operator $A \in \mathcal{B}(\mathcal{H})$ is called positive (semi-definite) if,
for all vectors $|v\rangle \in \mathcal{H}$, the following holds:
$\langle v|Av\rangle \geq 0$.

Since a density operator is defined to be a positive (Hermitian) operator with trace
one on a Hilbert space $\mathcal{H}$, a quantum state can be represented by a density
operator. We now define a quantum state to be a positive operator of unit trace
$\rho \in \mathcal{B}(\mathcal{H})$, where $\mathcal{B}(\mathcal{H})$ denotes the
algebra of linear operators acting on a finite-dimensional Hilbert space
$\mathcal{H}$.

2.1.1.1 Pure and mixed states 

According to the first postulate of quantum mechanics, a quantum system is
completely described by its state vector, $|\psi\rangle$. A state vector is a unit
vector in the state space of the system. A system whose state is completely known is
said to be in a pure state. The density operator for that system is given by the
projection $\rho = |\psi\rangle\langle\psi|$. If, however, a system is in one of a
number of states, then the system is said to be in a mixed state. If a system is in
one of the states $|\psi_i\rangle$ with respective probabilities $p_i$, then
$\{p_i, |\psi_i\rangle\}$ is called an ensemble of pure states. The corresponding
density operator is given by $\rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|$.
2.1.1.2 Composite quantum systems

A composite of two quantum systems $\mathcal{H}$ and $\mathcal{K}$ can be described by
the tensor product of the two Hilbert spaces, $\mathcal{H} \otimes \mathcal{K}$. Note
that $\dim(\mathcal{H} \otimes \mathcal{K}) = \dim \mathcal{H} \times \dim \mathcal{K}$.
The state of one of the subsystems can be extracted from the state of the composite
system by performing the partial trace on the composite system.

The tensor product and partial trace are both defined in the following section. 

2.1.1.3 The tensor product and partial trace 

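The tensor product of two vectors, $|h\rangle \in \mathcal{H}$ and
$|k\rangle \in \mathcal{K}$, is defined as

$$ |h\rangle \otimes |k\rangle = \sum_i \sum_j x_i y_j \, |h_i\rangle \otimes |k_j\rangle, \tag{2.1.3} $$

where $\mathcal{H}$ and $\mathcal{K}$ represent Hilbert spaces with respective bases
$\{|h_i\rangle\}$ and $\{|k_j\rangle\}$, and $x_i$ and $y_j$ are the coefficients of
$|h\rangle$ and $|k\rangle$ in these bases.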

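The following demonstrates how the tensor product of two vectors and two matrices
is computed, respectively,

$$ \begin{pmatrix} a \\ b \end{pmatrix} \otimes \begin{pmatrix} c \\ d \end{pmatrix}
= \begin{pmatrix} ac \\ ad \\ bc \\ bd \end{pmatrix}, \tag{2.1.4} $$

$$ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \otimes \begin{pmatrix} e & f \\ g & h \end{pmatrix}
= \begin{pmatrix} ae & af & be & bf \\ ag & ah & bg & bh \\ ce & cf & de & df \\ cg & ch & dg & dh \end{pmatrix}. \tag{2.1.5} $$

Properties of the tensor product include

$$ (\Phi \otimes \Psi)(\rho \otimes \rho') = \Phi(\rho) \otimes \Psi(\rho'). \tag{2.1.6} $$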
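The index pattern of Equations (2.1.4) and (2.1.5) is that of the Kronecker product. A minimal sketch, assuming matrices are stored as lists of rows and column vectors as single-column matrices, is given below; the name `kron` is chosen here for illustration.

```python
def kron(a, b):
    """Kronecker (tensor) product of two matrices stored as lists of rows.

    Each row of the result pairs one row of `a` with one row of `b`, and
    each entry is a product of one entry from each of those rows.
    """
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

vector_product = kron([[1], [2]], [[3], [4]])
matrix_product = kron([[1, 2], [3, 4]], [[0, 1], [1, 0]])
```

With symbolic entries the first row of the matrix product reads $(ae, af, be, bf)$, matching Equation (2.1.5).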



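The trace of an operator $A \in \mathcal{B}(\mathcal{H})$, with the orthonormal basis
$\{|\phi_i\rangle\}$, is given by

$$ \mathrm{Tr}(A) = \sum_{i=1}^{\dim(\mathcal{H})} \langle \phi_i | A | \phi_i \rangle. \tag{2.1.7} $$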

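We now introduce the partial trace. Let $\mathcal{H}$ and $\mathcal{K}$ represent two
Hilbert spaces with orthonormal bases $\{|h_i\rangle\}_{i=1}^{\dim(\mathcal{H})}$ and
$\{|k_j\rangle\}_{j=1}^{\dim(\mathcal{K})}$, respectively. Let $\rho$ be a state
defined on the composite system such that
$\rho \in \mathcal{B}(\mathcal{H} \otimes \mathcal{K})$. The state of the subsystem
$\mathcal{H}$ is given by the reduced density operator $\rho_{\mathcal{H}}$ and is
defined by

$$ \langle \Phi | \rho_{\mathcal{H}} | \Psi \rangle = \sum_j \langle \Phi \otimes k_j | \rho | \Psi \otimes k_j \rangle, \tag{2.1.8} $$

where $|\Phi\rangle$ and $|\Psi\rangle$ are vectors in $\mathcal{H}$. Inserting the
basis elements for $\Phi$ and $\Psi$, the partial trace can be calculated as follows,

$$ \rho_{\mathcal{H}} = \mathrm{Tr}_{\mathcal{K}}(\rho) = \sum_{l,m} \sum_j \langle h_l \otimes k_j | \rho | h_m \otimes k_j \rangle \, |h_l\rangle\langle h_m|, \tag{2.1.9} $$

where $\mathrm{Tr}_{\mathcal{K}}$ is the partial trace operation from
$\mathcal{B}(\mathcal{H} \otimes \mathcal{K})$ onto $\mathcal{B}(\mathcal{H})$. The
state of the subsystem $\mathcal{K}$ is similarly defined.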
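Equation (2.1.9) translates into a short routine once a basis ordering is fixed. The sketch below assumes the composite state is stored as a matrix in the product basis $|h_l\rangle \otimes |k_j\rangle$ with row index `l * dim_k + j`; this ordering and the function name are assumptions made for illustration.

```python
def partial_trace_second(rho, dim_h, dim_k):
    """Reduced state on H of a state rho on H (x) K, following (2.1.9).

    Entry (l, m) of the result sums the matrix elements of rho between
    h_l (x) k_j and h_m (x) k_j over the basis index j of K.
    """
    return [[sum(rho[l * dim_k + j][m * dim_k + j] for j in range(dim_k))
             for m in range(dim_h)]
            for l in range(dim_h)]

# Density matrix of the two-qubit state (|00> + |11>)/sqrt(2).
bell = [[0.5, 0.0, 0.0, 0.5],
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
        [0.5, 0.0, 0.0, 0.5]]
reduced = partial_trace_second(bell, 2, 2)
```

For this maximally entangled input the reduced state is the maximally mixed qubit state, even though the composite state is pure.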



2.2 Quantum channel 

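A map $\Phi : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{K})$ is said to be
completely positive if

$$ (\Phi \otimes \mathcal{I})(A) \geq 0, \tag{2.2.1} $$

where $A \geq 0$ is an operator defined on the Hilbert space
$\mathcal{H} \otimes \mathcal{H}'$, $\mathcal{H}'$ is an arbitrary space,
$\mathcal{K}$ is the Hilbert space of the output state and $\mathcal{I}$ is the
identity operator on $\mathcal{B}(\mathcal{H}')$.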

A quantum channel is defined as a completely positive, trace preserving map, which 
maps density operators from one Hilbert space to another. 

In general, when a pure input state is transmitted through a noisy quantum channel, 
the output state is not known with absolute certainty i.e. it is no longer a pure state. 
The corresponding state is said to be mixed. 

The initial state to a channel is given by the tensor product of the information state,
$\rho$, defined on the Hilbert space $\mathcal{H}$, and the initial state of the
environment $\rho_{\mathrm{env}} = |\psi_0\rangle\langle\psi_0|$, assumed to be in a
pure state and defined on the Hilbert space $\mathcal{H}_{\mathrm{env}}$.

Remark 1. It may be assumed that the initial state of a system is in a pure state as 
the state of the system can always be defined in terms of a larger composite system 
which can be chosen to be in a pure state. This is known as state purification. 

During transmission over a channel, the composite state
$\rho \otimes \rho_{\mathrm{env}}$ will evolve unitarily to
$U (\rho \otimes \rho_{\mathrm{env}}) U^*$, for a unitary operator $U$ on
$\mathcal{H} \otimes \mathcal{H}_{\mathrm{env}}$.

After the interaction of the channel with the state $\rho \otimes \rho_{\mathrm{env}}$,
the output state $\rho_{\mathrm{out}} \in \mathcal{B}(\mathcal{H})$ is given by

$$ \rho_{\mathrm{out}} = \mathrm{Tr}_{\mathcal{H}_{\mathrm{env}}} \left[ U (\rho \otimes \rho_{\mathrm{env}}) U^* \right]. \tag{2.2.2} $$

This corresponds to a measurement operator on the information state alone after it 
has evolved in interaction with the environment. 

Note that a unitary operator $U$ is one satisfying $U^* U = U U^* = I$. A unital
channel is a channel for which the following identity holds,

$$ \Phi(I) = I. \tag{2.2.3} $$

Next we introduce operator sum representation, a way of describing the action of a 
quantum channel on an input state. 

2.2.1 Operator sum representation 

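Quantum channels can be represented using the operator-sum, or Kraus, representation.
By tracing over the state space of the environment, the dynamics of the principal
system alone are extracted and represented explicitly. We will now show that this
representation is a re-statement of Equation (2.2.2).

Let $\{|\psi_k\rangle\}_{k=1}^{\dim(\mathcal{H}_{\mathrm{env}})}$ denote an orthonormal
basis for the Hilbert space of the environment $\mathcal{H}_{\mathrm{env}}$ and recall
the definition for the partial trace of an operator given by Equation (2.1.9).
Equation (2.2.2) now becomes

$$ \Phi(\rho) = \sum_{k=1}^{\dim(\mathcal{H}_{\mathrm{env}})} \sum_{l,l'} \langle h_l \otimes \psi_k | U (\rho \otimes |\psi_0\rangle\langle\psi_0|) U^* | h_{l'} \otimes \psi_k \rangle \, |h_l\rangle\langle h_{l'}|
= \sum_{k=1}^{\dim(\mathcal{H}_{\mathrm{env}})} E_k \rho E_k^*, \tag{2.2.4} $$

where $E_k = \sum_{l,l'} \langle h_l \otimes \psi_k | U | h_{l'} \otimes \psi_0 \rangle \, |h_l\rangle\langle h_{l'}|$,
$\{|h_l\rangle\}_{l=1}^{\dim(\mathcal{H})}$ is an orthonormal basis for $\mathcal{H}$
and $\rho \in \mathcal{B}(\mathcal{H})$. The operators
$\{E_k\}_{k=1}^{\dim(\mathcal{H}_{\mathrm{env}})} \subset \mathcal{B}(\mathcal{H})$ are
known as operation elements.

The operator $\Phi(\rho)$ represents the output state, and therefore must satisfy a
completeness relation such that $\mathrm{Tr}\, \Phi(\rho) = 1$. Using the operation
elements defined above,

$$ \mathrm{Tr}\left( \sum_{k=1}^{\dim(\mathcal{H}_{\mathrm{env}})} E_k \rho E_k^* \right) = 1. \tag{2.2.5} $$

Using the cyclic property of the trace,

$$ \mathrm{Tr}\left( \sum_{k=1}^{\dim(\mathcal{H}_{\mathrm{env}})} E_k \rho E_k^* \right) = \mathrm{Tr}\left( \sum_{k=1}^{\dim(\mathcal{H}_{\mathrm{env}})} E_k^* E_k \rho \right) = 1, \tag{2.2.6} $$

and since this must hold for all $\rho$, it follows that $\sum_k E_k^* E_k = I$.

A map $\Phi : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{K})$ is completely
positive and trace preserving if it admits a Kraus representation [22],

$$ \Phi(\rho) = \sum_k E_k \rho E_k^*, \qquad \sum_k E_k^* E_k = I. \tag{2.2.7} $$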
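The Kraus form and its completeness relation can be checked numerically for a concrete channel. The sketch below uses the standard textbook operation elements of a qubit amplitude-damping channel with damping parameter `gamma`; this parametrisation is an assumption made for illustration and may differ from the one adopted in later chapters. All helper names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose A* of a matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def add(a, b):
    """Entrywise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def amplitude_damping_kraus(gamma):
    """Assumed operation elements E_0, E_1 of the amplitude-damping channel."""
    return [[[1, 0], [0, math.sqrt(1 - gamma)]],
            [[0, math.sqrt(gamma)], [0, 0]]]

def apply_channel(kraus, rho):
    """Phi(rho) = sum_k E_k rho E_k^*."""
    terms = [matmul(matmul(e, rho), dagger(e)) for e in kraus]
    out = terms[0]
    for term in terms[1:]:
        out = add(out, term)
    return out

def completeness_sum(kraus):
    """sum_k E_k^* E_k, which equals the identity for a CPT map."""
    terms = [matmul(dagger(e), e) for e in kraus]
    out = terms[0]
    for term in terms[1:]:
        out = add(out, term)
    return out

kraus = amplitude_damping_kraus(0.3)
excited = [[0, 0], [0, 1]]                 # the state |1><1|
output = apply_channel(kraus, excited)     # population decays towards |0><0|
identity_check = completeness_sum(kraus)
```

Under this parametrisation the excited state is mapped to a diagonal state with weights `gamma` and `1 - gamma`, and the completeness sum reproduces the identity up to rounding.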



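A memoryless channel is given by a completely positive map
$\Phi : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{K})$, where
$\mathcal{B}(\mathcal{H})$ and $\mathcal{B}(\mathcal{K})$ denote the states on the
input and output Hilbert spaces $\mathcal{H}$ and $\mathcal{K}$.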



2.3 Positive operator-valued measure

Measurement of a quantum system can be described by a set of Hermitian matrices,
$\{E_k\}$, satisfying $E_k \geq 0$ and $\sum_k E_k = I$ ([22, 23]). The set $\{E_k\}$
is called a positive operator-valued measure (POVM).

If a measurement, described by the set $\{E_k\}$, is performed on a system in a state
$\rho$, then the probability of obtaining outcome label $k$ is given by
$\mathrm{Tr}(\rho E_k)$.
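The outcome probabilities $\mathrm{Tr}(\rho E_k)$ can be computed without forming the full matrix product. In the sketch below the state and POVM elements are stored as nested lists, and a projective measurement in the computational basis is used as an assumed example of a POVM.

```python
def trace_of_product(a, b):
    """Tr(AB) for square matrices of equal size, stored as lists of rows."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

def povm_probabilities(rho, povm):
    """Probability Tr(rho E_k) of each outcome label k."""
    return [trace_of_product(rho, e).real for e in povm]

plus_state = [[0.5, 0.5], [0.5, 0.5]]        # projection onto (|0> + |1>)/sqrt(2)
computational_basis = [[[1, 0], [0, 0]],      # E_0 = |0><0|
                       [[0, 0], [0, 1]]]      # E_1 = |1><1|
probabilities = povm_probabilities(plus_state, computational_basis)
```

Because the POVM elements sum to the identity and the state has unit trace, the returned probabilities sum to one.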



2.4 Quantum entanglement 

A state $\rho \in \mathcal{B}(\mathcal{H} \otimes \mathcal{K})$ is said to be separable
if it can be written as a probabilistic mixture of product states

$$ \rho = \sum_i p_i \, |h_i\rangle\langle h_i| \otimes |k_i\rangle\langle k_i|, \tag{2.4.1} $$

where $|h_i\rangle \in \mathcal{H}$, $|k_i\rangle \in \mathcal{K}$, $\sum_i p_i = 1$
and $p_i \geq 0$. Otherwise the state is said to be an entangled state.

Entanglement is an important resource in quantum information processing and plays
an essential role in quantum teleportation, quantum cryptography, quantum
computation and quantum error correction [24].



2.5 Classical information over a quantum channel 

The transmission of classical information over a quantum channel is achieved by
encoding the information into quantum states. To accomplish this, a set of possible
input states $\rho_j \in \mathcal{B}(\mathcal{H})$ with probabilities $p_j$ are
prepared, describing the ensemble $\{p_j, \rho_j\}$. The average input state to the
channel is expressed as $\bar{\rho} = \sum_j p_j \rho_j$. The average output state is
$\Phi(\bar{\rho}) = \sum_j p_j \Phi(\rho_j)$ [25].

2.5.1 Holevo bound 

When a state is sent through a noisy quantum channel, the amount of information
about the input state that can be inferred from the output state is called the
accessible information. For any ensemble $\{p_j, \rho_j\}$, the Holevo quantity is
defined as

$$ \chi(\{p_j, \rho_j\}) := S\left( \sum_j p_j \rho_j \right) - \sum_j p_j S(\rho_j). \tag{2.5.1} $$

The Holevo bound [26] provides an upper bound on the accessible information and
is given by

$$ H(X:Y) \leq \chi(\{p_j, \Phi(\rho_j)\}), \tag{2.5.2} $$

where $\chi(\{p_j, \Phi(\rho_j)\})$ is the Holevo quantity of the channel $\Phi$. Here
$X$ is the random variable representing the classical input to the channel. The
possible values $x_j$ are mapped to states $\rho_j$, which are transformed to
$\Phi(\rho_j)$ by the channel. Then, a generalised measurement with corresponding POVM
$\{E_k\}$ allows the determination of the output random variable $Y$ with conditional
probability distribution given by

$$ \mathbb{P}(Y = x_k \,|\, X = x_j) = \mathrm{Tr}(\Phi(\rho_j) E_k). \tag{2.5.3} $$

The second term in the Holevo bound is often referred to as the output entropy. This
term represents the joint entropy of the system
$\mathcal{H} \otimes \mathcal{H}_{\mathrm{env}}$ after evolution and can be
interpreted as the final entropy of the environment, assuming that the environment
was initially in a pure state. This is the amount of information that the information
state, or principal system, has exchanged with the environment. We therefore want
to minimise the output entropy and maximise the entropy of the expected state. This
justifies the definition of the capacity of the channel as the maximum of the mutual
information. When quantum information is sent down a noisy quantum channel, the
output entropy is known as entropy exchange [27].



Holevo [26] has introduced a measure of the amount of classical information
remaining in a state that has been sent over a noisy quantum channel. The
product-state capacity of a channel is given by the maximisation of this Holevo
quantity over all ensembles of input states, and can be interpreted as the amount of
information that can be sent reliably over the channel in the form of product states.

The fact that this capacity is given by the maximum of the Holevo quantity is known
as the Holevo-Schumacher-Westmoreland (HSW) theorem.
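The Holevo quantity of Equation (2.5.1) is straightforward to evaluate for a qubit ensemble. The sketch below assumes each ensemble member is a pair `(p_j, rho_j)` with `rho_j` a 2x2 density matrix stored as nested lists; the function names are illustrative, and the maximisation over ensembles that defines the capacity is not attempted here.

```python
import math

def qubit_entropy(rho):
    """von Neumann entropy in bits of a 2x2 density matrix."""
    a, d, b = rho[0][0].real, rho[1][1].real, rho[0][1]
    disc = math.sqrt((a - d) ** 2 + 4 * abs(b) ** 2)
    eigenvalues = ((a + d + disc) / 2, (a + d - disc) / 2)
    return sum(-lam * math.log2(lam) for lam in eigenvalues if lam > 1e-15)

def holevo_quantity(ensemble):
    """chi = S(sum_j p_j rho_j) - sum_j p_j S(rho_j) for (p_j, rho_j) pairs."""
    average = [[sum(p * rho[i][j] for p, rho in ensemble) for j in range(2)]
               for i in range(2)]
    return qubit_entropy(average) - sum(p * qubit_entropy(rho)
                                        for p, rho in ensemble)

orthogonal = [(0.5, [[1, 0], [0, 0]]), (0.5, [[0, 0], [0, 1]])]
identical = [(0.5, [[1, 0], [0, 0]]), (0.5, [[1, 0], [0, 0]])]
chi_orthogonal = holevo_quantity(orthogonal)
chi_identical = holevo_quantity(identical)
```

Two equiprobable orthogonal pure states give one bit, while an ensemble of identical states carries no information. To obtain $\chi(\{p_j, \Phi(\rho_j)\})$ one would first apply the channel to each $\rho_j$.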
2.6 The Holevo-Schumacher-Westmoreland theorem

If the possible input states to a channel are prepared as product states of the form
$\rho_1 \otimes \rho_2 \otimes \cdots$, then the associated capacity is known as the
product-state capacity. This implies that the input states have not been entangled
over multiple uses of the channel. The capacity for channels with entangled input
states has been studied [28], and it has been shown that for certain channels the use
of entangled states can enhance the inference of the output state and increase the
capacity (e.g. [29]).



The HSW theorem, proved independently by Holevo [1] and by Schumacher and
Westmoreland [2], provides the following expression for the product-state capacity
for classical information sent through a quantum channel, $\Phi$,

$$ \chi^*(\Phi) = \max_{\{p_j, \rho_j\}} \chi(\{p_j, \Phi(\rho_j)\}), \tag{2.6.1} $$

where $S$ is the von Neumann entropy, $S(\rho) = -\mathrm{Tr}(\rho \log \rho)$. If
$\rho$ has eigenvalues $\lambda_j$, then $S(\rho) = -\sum_j \lambda_j \log(\lambda_j)$.
The capacity is given by the maximum mutual information calculated over all ensembles
$\{p_j, \rho_j\}$ [14]. Properties characterising optimal input ensembles have been
studied [30].

Remark 2. Prior to the HSW theorem, Holevo [31] developed a formula for
calculating the product-state capacity of a quantum channel where a maximisation of
the accessible information is taken explicitly over both the input ensemble and over
product measurements performed on the output of the channel. It has been shown
that, in certain cases, more information can be transmitted per use of a quantum
channel using collective measurements rather than separable ones (see [31, 32, 33]).

2.6.1 Optimal input ensembles

By concavity of the entropy, the maximum in Equation (2.6.1) is always attained
for an ensemble of pure states $\rho_j$. Indeed, we can decompose each $\rho_j$ as a
convex combination of pure states,
$\rho_j = \sum_k q_k |\psi_{j,k}\rangle\langle\psi_{j,k}|$. This does not change the
first term of (2.6.1), but by concavity of the entropy,

$$ S(\Phi(\rho_j)) \geq \sum_k q_k \, S(\Phi(|\psi_{j,k}\rangle\langle\psi_{j,k}|)). \tag{2.6.2} $$

Moreover, it follows from Carathéodory's theorem [34, 35, 36] that the ensemble can
always be assumed to contain no more than $d^2$ pure states, where
$d = \dim(\mathcal{H})$.

A statement of Carathéodory's Theorem is provided in Appendix A.1, along with
a proof by N. Datta in Appendix A.2 that it is sufficient to consider
ensembles consisting of at most $d^2$ pure states in the maximisation of the Holevo
quantity $\chi(\Phi)$, for some CPT map $\Phi$.

Next we introduce two models for quantum memory. 



2.7 Models for quantum memory channels 



Bowen and Mancini [37, 38] introduced two models for quantum channels with
memory. The model shown in Figure 2.1 depicts an interaction between each input
state $\rho_j$ and its environment $E_j$. The environments $\{E_j\}_{j=1}^{n}$ are
correlated, which leads to a memory effect at each stage of the evolution.

[Figure: circuit diagram in which the input states $\rho_1, \rho_2, \rho_3, \ldots$
interact with the correlated environments $E_1, E_2, E_3, \ldots$ through the
unitaries $U_1, U_2, U_3, \ldots$]

Figure 2.1: A model for quantum channel memory: each input state $\rho_j$ interacts
with its own environment, which is itself correlated with the other environments [39].

In contrast to the previous model, Figure 2.2 depicts the input state $\rho_j$
interacting with its own environment and with the memory state. The error operators at
each stage of the evolution are correlated, and may be determined using the relevant
unitary operator and the input state. Both processes will be described in the
following subsections.



2.7.1 Several uses of a memoryless quantum channel 



Recall that it is known that any quantum channel, described by a completely positive 
trace preserving (CPT) map, can be represented by a unitary operation on the input 
state to the channel and the initial (known) state of the environment ll22l . The output 



22 



2.7. MODELS FOR QUANTUM MEMORY CHANNELS 

























Figure 2.2: Model for quantum channel memory: correlations between each error
operator and input state are determined by the relevant unitary operator and the
memory state [39].



state following a sequence of $n$ uses of the memoryless channel $\Phi$ is given by

\[
\Phi^{(n)}(\rho^{(n)}) = \mathrm{Tr}_E \left[ U_{n,E_n} \cdots U_{1,E_1} \left( \rho^{(n)} \otimes |0_{E_1} \cdots 0_{E_n}\rangle\langle 0_{E_1} \cdots 0_{E_n}| \right) U^{\dagger}_{1,E_1} \cdots U^{\dagger}_{n,E_n} \right], \tag{2.7.1}
\]

where $\rho^{(n)} \in \mathcal{B}(\mathcal{H}^{\otimes n})$ is a (possibly entangled) input state codeword and
$\rho_E = |0_{E_1} \cdots 0_{E_n}\rangle\langle 0_{E_1} \cdots 0_{E_n}|$ represents the (product) state of the environment.

Note that the trace is taken over each state comprising the state of the environment.

2.7.2 Several uses of a quantum channel with memory 

The action of a quantum memory channel on a sequence of input states can be 
viewed in the following two ways. 






2.7.2.1 Model 1 



The action of the channel described by Figure 2.1 can be described as follows

\[
\Phi^{(n)}(\rho^{(n)}) = \mathrm{Tr}_E \left[ U_{n,E_n} \cdots U_{1,E_1} \left( \rho^{(n)} \otimes \omega_E \right) U^{\dagger}_{1,E_1} \cdots U^{\dagger}_{n,E_n} \right], \tag{2.7.2}
\]

where $\omega_E = \Omega_E \rho_E \Omega_E^{\dagger}$ and $\Omega_E$ is a unitary operator on $E$ which introduces
correlations between the environments $E_j$. Here we are replacing the separable state $\rho_E$
introduced in Section 2.7.1 with a correlated state $\omega_E$.

2.7.2.2 Model 2 



Each input state $\rho_k$ to the channel interacts unitarily with the channel
memory state, denoted $|M\rangle\langle M|$, and also with an independent environment $E_k$. This
process is depicted in Figure 2.2.

The output state from such a quantum memory channel can be expressed as follows

\[
\Phi^{(n)}(\rho^{(n)}) = \mathrm{Tr}_{ME} \left[ U_{n,ME_n} \cdots U_{1,ME_1} \left( \rho^{(n)} \otimes |M\rangle\langle M| \otimes |0_{E_1} \cdots 0_{E_n}\rangle\langle 0_{E_1} \cdots 0_{E_n}| \right) U^{\dagger}_{1,ME_1} \cdots U^{\dagger}_{n,ME_n} \right]. \tag{2.7.3}
\]

Note that if the unitaries acting on the memory state and environment can be
written as $U_{k,ME_k} = U_{k,E_k} \otimes U_M$, then the memory can be traced out and we recover the
memoryless channel.
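
The last remark can be checked numerically. The following is a minimal sketch for a single channel use, assuming a qubit system, a qubit environment and a qubit memory with randomly drawn unitaries. The helper names `random_unitary` and `trace_out_last`, the choice $|M\rangle = |+\rangle$, and the ordering of the tensor factors (system, environment, memory) are assumptions made for this illustration only.

```python
import numpy as np

def random_unitary(dim, rng):
    """Unitary obtained from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)
    return q

def trace_out_last(op, keep_dim, drop_dim):
    """Partial trace over the last tensor factor of an operator on keep_dim x drop_dim."""
    op = op.reshape(keep_dim, drop_dim, keep_dim, drop_dim)
    return np.einsum("ajbj->ab", op)

rng = np.random.default_rng(0)

# Tensor factors are ordered as system S, environment E, memory M (illustrative choice).
rho = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)   # input qubit state
env = np.array([[1, 0], [0, 0]], dtype=complex)           # |0><0| on E
mem = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)   # |M><M| with |M> = |+>

U_SE = random_unitary(4, rng)   # system-environment interaction
U_M = random_unitary(2, rng)    # separate evolution of the memory

# Product unitary: the memory never couples to the system or its environment.
U = np.kron(U_SE, U_M)
joint = np.kron(np.kron(rho, env), mem)
evolved = U @ joint @ U.conj().T
out_with_memory = trace_out_last(trace_out_last(evolved, 4, 2), 2, 2)  # trace out M, then E

# Memoryless channel defined by U_SE alone.
out_memoryless = trace_out_last(U_SE @ np.kron(rho, env) @ U_SE.conj().T, 2, 2)

print(np.allclose(out_with_memory, out_memoryless))
```

When the unitary does not factorise in this way, the two outputs generally differ, which is the memory effect that Model 2 is meant to capture.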



Quantum channels which have Markovian noise correlations are a class of channels
which can be represented by the above model. This class of channels is of particular
interest to us and is discussed below.



2.8 Markov processes and channel memory 

Next, we provide definitions [40] needed to describe quantum channels with classi- 
cal memory. 

2.8.1 Definitions and basic properties 

Let $I$ denote a countable set and let $\lambda_i = P(X = i)$, where $X$ is a random variable
taking values in the state space $I$. Let $P$ denote a transition matrix, with entries
labelled $p_{j|i}$.

A Markov chain is given by a sequence of random variables $X_0, \ldots, X_{n-1}$ with the
following property,

\[
P(X_{n-1} = i_{n-1} \,|\, X_{n-2} = i_{n-2}, \ldots, X_0 = i_0) = P(X_{n-1} = i_{n-1} \,|\, X_{n-2} = i_{n-2}) = p_{i_{n-1}|i_{n-2}}. \tag{2.8.1}
\]

Equivalently, a discrete time random process denoted $X_n$ can be considered to be a
Markov chain with transition matrix $P$ and initial distribution $\lambda$, if and only if the
following holds for $i_0, \ldots, i_{n-1} \in I$ (see Norris [40], Theorem 1.1.1)

\[
P(X_0 = i_0, X_1 = i_1, \ldots, X_{n-1} = i_{n-1}) = \lambda_{i_0}\, p_{i_1|i_0}\, p_{i_2|i_1} \cdots p_{i_{n-1}|i_{n-2}}. \tag{2.8.2}
\]

A state $j$ is said to be accessible from state $i$, written $i \to j$, if there
exists $n \geq 0$ such that

\[
P(X_n = j \,|\, X_0 = i) > 0. \tag{2.8.3}
\]

The state $i \in I$ is said to communicate with state $j \in I$ if $i \to j$ and $j \to i$. This
relation, denoted $i \leftrightarrow j$, partitions the state space $I$ into communicating classes. A
Markov chain is said to be irreducible if the state space $I$ is a single class.

A state $i$ has period $L$ if any return to the state $i$ occurs in multiples of $L$ time steps,
i.e.

\[
L = \gcd\{n : P(X_n = i \,|\, X_0 = i) > 0\}. \tag{2.8.4}
\]

Next we use these concepts to describe classical memory.

2.8.2 Classical memory 

A channel, of length $n$, with Markovian noise correlations can be described as follows

\[
\Phi^{(n)}(\rho^{(n)}) = \sum_{i_0, \ldots, i_{n-1}} q_{i_{n-1}|i_{n-2}} \cdots q_{i_1|i_0}\, \lambda_{i_0}\, (\Phi_{i_0} \otimes \cdots \otimes \Phi_{i_{n-1}})(\rho^{(n)}), \tag{2.8.5}
\]

where $q_{j|i}$ denotes the elements of the transition matrix of a discrete-time Markov
chain, and $\{\lambda_i\}$ represents an invariant distribution on the Markov chain.


In later chapters we analyse two particular channels with classical memory, the
periodic channel and a convex combination of memoryless channels. Both are described
below.

A periodic channel can be described as follows

\[
\Phi^{(n)}(\rho^{(n)}) = \frac{1}{L} \sum_{i=0}^{L-1} (\Phi_i \otimes \Phi_{i+1} \otimes \cdots \otimes \Phi_{i+n-1})(\rho^{(n)}), \tag{2.8.6}
\]

where $\Phi_i$ are CPT maps and the index is cyclic modulo the period $L$. In this case
$q_{j|i} = \theta_{ij}$, where

\[
\theta_{ij} = \begin{cases} 1, & \text{if } j = (i+1) \bmod L \\ 0, & \text{otherwise.} \end{cases} \tag{2.8.7}
\]

A convex combination of product channels is defined by the following channel

\[
\Phi^{(n)}(\rho^{(n)}) = \sum_{i=1}^{M} \gamma_i\, \Phi_i^{\otimes n}(\rho^{(n)}), \tag{2.8.8}
\]

where $\gamma_i$ $(i = 1, \ldots, M)$ is a probability distribution over channels
$\Phi_1, \ldots, \Phi_M$. The action of the channel can be interpreted as follows. With
probability $\gamma_i$ a given input state $\rho^{(n)} \in \mathcal{B}(\mathcal{H}^{\otimes n})$ is transmitted through one of the
memoryless channels. The corresponding Markov chain is aperiodic but not irreducible.

In this case the elements of the transition matrix are $q_{j|i} = \delta_{ij}$, i.e. the transition
matrix is equal to the identity matrix. Note that Ahlswede [41] introduced the classical
version of this channel and its capacity was proved by Jacobs [42].
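
The Markov-chain properties claimed for the two channels can be checked directly from their transition matrices. The sketch below is illustrative only: the function names are hypothetical, the matrices are stored as nested lists with `P[i][j]` holding $q_{j|i}$, and the period is estimated from return times up to a finite cutoff (an assumption that is adequate for these small examples).

```python
from math import gcd

def reachable(P, i):
    """States accessible from i (i itself is accessible in n = 0 steps)."""
    seen, stack = {i}, [i]
    while stack:
        u = stack.pop()
        for v, prob in enumerate(P[u]):
            if prob > 0 and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def is_irreducible(P):
    """True if the whole state space forms a single communicating class."""
    n = len(P)
    return all(len(reachable(P, i)) == n for i in range(n))

def period(P, i, max_steps=None):
    """gcd of the step counts n >= 1 at which a return to i has positive probability."""
    n = len(P)
    max_steps = max_steps or 2 * n * n      # finite cutoff, enough for small chains
    current, g = {i}, 0
    for k in range(1, max_steps + 1):
        # states reachable from i in exactly k steps
        current = {v for u in current for v, prob in enumerate(P[u]) if prob > 0}
        if i in current:
            g = gcd(g, k)
    return g

L = 3
periodic = [[1 if j == (i + 1) % L else 0 for j in range(L)] for i in range(L)]
M = 3
identity = [[1 if j == i else 0 for j in range(M)] for i in range(M)]

print(period(periodic, 0), is_irreducible(periodic))
print([period(identity, i) for i in range(M)], is_irreducible(identity))
```

For the cyclic matrix every state has period $L$ and the chain is irreducible, whereas for the identity matrix every state has period 1 (aperiodic) but no state communicates with any other.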
Chapter 3



Deriving a minimal ensemble for the 
quantum amplitude damping 
channel 



In this chapter we focus on obtaining the maximiser for classical information trans- 
mitted in the form of product- states over a noisy quantum channel. We consider in 
particular the problem of determining this maximiser in the case of the amplitude 
damping channel. The amplitude damping channel models the loss of energy in a 
system and is an example of a non-unital channel (see Section I2TTT) . The effect of 
the qubit amplitude damping channel on the Bloch (Poincare) sphere is to "squash" 
the sphere to the $|0\rangle$ pole, resulting in an ellipsoid. The Bloch sphere will be
discussed in more detail in Section 3.1.1 and the resulting space of the output states
from an example amplitude-damping channel, using the optimal input ensemble for
that channel, can be seen in Figure 3.7 of this chapter.

It is known in general that the maximising ensemble can always be assumed to 
consist of at most $d^2$ pure states if $d$ is the dimension of the state space, but we show
that in the case of the qubit amplitude-damping channel, the maximum is in fact 
obtained for an ensemble of two pure states. Moreover, these states are in general 
not orthogonal. This result is rather surprising, since nonorthogonal quantum states 
cannot be distinguished with perfect reliability. 

Note that Fuchs [43] has also described a particular channel, the so-called "splay- 
ing" channel, whose product state capacity is maximised using an ensemble of non- 
orthogonal states. 



3.1 The amplitude-damping channel 

The qubit amplitude-damping channel models the loss of energy in a qubit quantum
system and is described, with error parameter $0 \leq \gamma \leq 1$, by the following operation
elements Q31 



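\[
E_0 = \begin{pmatrix} 1 & 0 \\ 0 & \sqrt{1-\gamma} \end{pmatrix}, \qquad
E_1 = \begin{pmatrix} 0 & \sqrt{\gamma} \\ 0 & 0 \end{pmatrix}. \tag{3.1.1}
\]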



Using the operation elements above, the qubit amplitude-damping channel can be
expressed as follows

\[
\Phi_{amp}(\rho) = E_0 \rho E_0^{*} + E_1 \rho E_1^{*}. \tag{3.1.2}
\]

Note that since $E_0^{*} E_0 + E_1^{*} E_1 = I$, the operator $\Phi_{amp}$ is a CPT map and therefore
a legitimate quantum channel.

Acting on the general qubit state $\rho$, given by

\[
\rho = \begin{pmatrix} a & b \\ \bar{b} & 1-a \end{pmatrix}, \tag{3.1.3}
\]

the output of the channel $\Phi_{amp}$ is given by

\[
\Phi_{amp}(\rho) = \begin{pmatrix} a + (1-a)\gamma & b\sqrt{1-\gamma} \\ \bar{b}\sqrt{1-\gamma} & (1-a)(1-\gamma) \end{pmatrix}. \tag{3.1.4}
\]



The amplitude-damping channel can be interpreted as follows. Evaluating

\[
E_0 \rho E_0^{*} = \begin{pmatrix} a & b\sqrt{1-\gamma} \\ \bar{b}\sqrt{1-\gamma} & (1-\gamma)(1-a) \end{pmatrix}, \tag{3.1.5}
\]

we can easily see that if the input state is given by $\rho = |0\rangle\langle 0|$, then the state is left
unchanged by $E_0 \rho E_0^{*}$, with probability 1. However, if the input state is $\rho = |1\rangle\langle 1|$,
the amplitude of the state is multiplied by a factor $1 - \gamma$.

On the other hand,

\[
E_1 \rho E_1^{*} = \gamma \begin{pmatrix} 1-a & 0 \\ 0 & 0 \end{pmatrix}. \tag{3.1.6}
\]

In this case, the input state $\rho = |1\rangle\langle 1|$ is replaced with the state $|0\rangle\langle 0|$ with
probability $\gamma$.

Therefore,

\[
\Phi_{amp}(|0\rangle\langle 0|) = |0\rangle\langle 0|, \tag{3.1.7}
\]

and

\[
\Phi_{amp}(|1\rangle\langle 1|) = \gamma\, |0\rangle\langle 0| + (1-\gamma)\, |1\rangle\langle 1|. \tag{3.1.8}
\]

The eigenvalues of $\Phi_{amp}(\rho)$ are easily found to be

\[
\lambda_{amp\pm} = \frac{1}{2} \left( 1 \pm \sqrt{(1 + 2a(\gamma - 1) - 2\gamma)^2 - 4|b|^2(\gamma - 1)} \right). \tag{3.1.9}
\]
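
As a numerical cross-check of (3.1.1) to (3.1.9), the following sketch applies the operation elements to a sample qubit state and compares the result with the closed forms. The function names and the particular values of $\gamma$, $a$ and $b$ are illustrative assumptions, not values taken from the text.

```python
import numpy as np

def amplitude_damping_kraus(gamma):
    """Operation elements E_0, E_1 of the qubit amplitude-damping channel, as in (3.1.1)."""
    E0 = np.array([[1.0, 0.0], [0.0, np.sqrt(1.0 - gamma)]])
    E1 = np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]])
    return [E0, E1]

def apply_channel(kraus, rho):
    """Kraus form of the channel: sum_k E_k rho E_k^*."""
    return sum(E @ rho @ E.conj().T for E in kraus)

gamma, a = 0.2, 0.3
b = 0.25 + 0.1j                      # any b with |b|^2 <= a(1 - a) gives a valid state
rho = np.array([[a, b], [np.conj(b), 1 - a]])

kraus = amplitude_damping_kraus(gamma)
completeness = sum(E.conj().T @ E for E in kraus)   # should equal the identity
out = apply_channel(kraus, rho)

# Closed form (3.1.4)
closed_form = np.array([
    [a + (1 - a) * gamma, b * np.sqrt(1 - gamma)],
    [np.conj(b) * np.sqrt(1 - gamma), (1 - a) * (1 - gamma)],
])

# Eigenvalue formula (3.1.9), listed in ascending order
disc = (1 + 2 * a * (gamma - 1) - 2 * gamma) ** 2 - 4 * abs(b) ** 2 * (gamma - 1)
eig_formula = np.array([0.5 * (1 - np.sqrt(disc)), 0.5 * (1 + np.sqrt(disc))])

print(np.allclose(completeness, np.eye(2)))
print(np.allclose(out, closed_form))
print(np.allclose(np.linalg.eigvalsh(out), eig_formula))
```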



Next we derive the product-state capacity of the qubit amplitude damping channel. 

3.1.1 Product-state capacity of the qubit amplitude damping 
channel 

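Recall (Section 2.5.1) that the Holevo quantity for a channel $\Phi$ is defined as

\[
\chi(\{p_j, \Phi(\rho_j)\}) = S\Big( \Phi\Big( \sum_j p_j \rho_j \Big) \Big) - \sum_j p_j S(\Phi(\rho_j)). \tag{3.1.10}
\]

In the case of the amplitude-damping channel, given by Equation (3.1.4), the Holevo
quantity is given as follows,

\[
\chi(\{p_j, \Phi_{amp}(\rho_j)\}) = S\left[ \sum_j \begin{pmatrix} p_j (a_j + (1-a_j)\gamma) & p_j b_j \sqrt{1-\gamma} \\ p_j \bar{b}_j \sqrt{1-\gamma} & p_j (1-a_j)(1-\gamma) \end{pmatrix} \right]
- \sum_j p_j\, S\left[ \begin{pmatrix} a_j + (1-a_j)\gamma & b_j \sqrt{1-\gamma} \\ \bar{b}_j \sqrt{1-\gamma} & (1-a_j)(1-\gamma) \end{pmatrix} \right]. \tag{3.1.11}
\]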



To maximise Equation (3.1.11) we will show that the first term is increased, while
keeping the second term fixed, if each pure state $\rho_j$ is replaced by itself and its
mirror image in the real $b$-axis. In other words, replacing

\[
\rho_j = \begin{pmatrix} a_j & b_j \\ \bar{b}_j & 1-a_j \end{pmatrix},
\]

associated with probability $p_j$, with the states $\rho_j$ and

\[
\rho'_j = \begin{pmatrix} a_j & -b_j \\ -\bar{b}_j & 1-a_j \end{pmatrix},
\]

both with probabilities $p_j/2$, will increase Equation (3.1.11).

The Bloch sphere (also known as the Poincaré sphere) is a representation of the
state space of a two-level quantum system, i.e. a qubit. Pure states (corresponding
to the extreme points in the (convex) set of density operators) are given by points
on the surface of the sphere.

An example of antipodal states is shown in Figure 3.1 below, which depicts a
two-dimensional cross-section of the Bloch sphere. Here, the state $\rho_1$ has been replaced
by itself and $\rho'_1$, and similarly for $\rho_2$.



Figure 3.1: An example of two pairs of antipodal pure states.

As remarked above, the maximum in Equation (2.6.1) can be achieved by a pure
state ensemble of (at most) $d^2$ states, where $d$ is the dimension of the input to the
channel. In general, the states $\rho_j$ must lie inside the Poincaré sphere

\[
\left(a - \tfrac{1}{2}\right)^2 + |b|^2 \leq \tfrac{1}{4}, \tag{3.1.12}
\]

and so, the pure states will lie on the boundary

\[
\left(a - \tfrac{1}{2}\right)^2 + |b|^2 = \tfrac{1}{4} \;\Longrightarrow\; |b|^2 = a(1-a). \tag{3.1.13}
\]



We first show that the second term in Equation (3.1.10) remains unchanged when
the states are replaced in the way described above. Indeed, since the eigenvalues
(3.1.9) depend only on $|b|$, we have $S(\Phi(\rho_j)) = S(\Phi(\rho'_j))$ and therefore,

\[
\sum_j \frac{p_j}{2} \left[ S(\Phi(\rho_j)) + S(\Phi(\rho'_j)) \right] = \sum_j p_j S(\Phi(\rho_j)). \tag{3.1.14}
\]

Secondly, we prove that $S(\Phi(\sum_j p_j \rho_j))$ is in fact increased by replacing each
state with itself and its mirror image, each with half their original weight. Indeed,
as $S$ is a concave function,

\[
S\Big( \Phi\Big( \sum_j \frac{p_j}{2} (\rho_j + \rho'_j) \Big) \Big) \geq \frac{1}{2} S\Big( \Phi\Big( \sum_j p_j \rho_j \Big) \Big) + \frac{1}{2} S\Big( \Phi\Big( \sum_j p_j \rho'_j \Big) \Big), \tag{3.1.15}
\]

and again, since $S(\Phi(\sum_j p_j \rho_j)) = S(\Phi(\sum_j p_j \rho'_j))$, the right-hand side is equal
to $S(\Phi(\sum_j p_j \rho_j))$.

We can conclude that the first term in Equation (3.1.10) is increased with the second
term fixed if each state $\rho_j$ is replaced by itself together with its mirror image.

Remark 3. It follows, in particular, that we can assume from now on that all $b_j$ are
real, as the average state $\sum_j \frac{p_j}{2}(\rho_j + \rho'_j)$ has zero off-diagonal elements, whereas
the eigenvalues of $\Phi(\rho_j)$ only depend on $|b_j|$.


3.1.1.1 Convexity of the output entropy 

We concentrate here on proving that, in the case of the amplitude-damping channel,
the second term in the equation for the Holevo quantity is convex as a function of
the parameters $a_j$, when $\rho_j$ is taken to be a pure state, i.e. $b_j = \sqrt{a_j(1-a_j)}$. Thus
$S(\Phi(\rho_j))$ is a function of one variable only, i.e. $S(a_j)$. It is given by,

\[
S(a_j) = S(\Phi_{amp}(\rho_{a_j})), \tag{3.1.17}
\]

where

\[
\rho_a = \begin{pmatrix} a & \sqrt{a(1-a)} \\ \sqrt{a(1-a)} & 1-a \end{pmatrix}, \tag{3.1.18}
\]

that is,

\[
\sigma(a) = \Phi_{amp}(\rho_a) = \begin{pmatrix} a + (1-a)\gamma & \sqrt{a(1-a)}\sqrt{1-\gamma} \\ \sqrt{a(1-a)}\sqrt{1-\gamma} & (1-a)(1-\gamma) \end{pmatrix}. \tag{3.1.19}
\]

Inserting $b^2 = a(1-a)$ into Equation (3.1.9), the eigenvalues for the
amplitude-damping channel can be written as

\[
\lambda_{amp\pm} = \frac{1}{2} \left( 1 \pm \sqrt{1 - 4\gamma(1-\gamma)(1-a)^2} \right). \tag{3.1.20}
\]

Denote

\[
x = \sqrt{1 - 4\gamma(1-\gamma)(1-a)^2}. \tag{3.1.21}
\]

Then

\[
S(a) = -\left(\frac{1+x}{2}\right) \log\left(\frac{1+x}{2}\right) - \left(\frac{1-x}{2}\right) \log\left(\frac{1-x}{2}\right). \tag{3.1.22}
\]



We prove that $S''(a)$ is positive. A straightforward calculation yields

\[
S''(a) \ln 2 = \frac{2\gamma(1-\gamma)}{x^3} \ln\frac{1+x}{1-x} - \frac{4\gamma(1-\gamma)}{x^2} \tag{3.1.23}
\]
\[
\phantom{S''(a) \ln 2} = \frac{2\gamma(1-\gamma)}{x^3} \left( \ln\frac{1+x}{1-x} - 2x \right). \tag{3.1.24}
\]

Since the prefactor $2\gamma(1-\gamma)/x^3$ in the above equation is positive, the problem of
proving the convexity of $S(a)$ reduces to proving that,

\[
\ln\frac{1+x}{1-x} \geq 2x. \tag{3.1.25}
\]

This is easily shown, since $\ln\frac{1+x}{1-x} = 2\left(x + \frac{x^3}{3} + \frac{x^5}{5} + \cdots\right)$ for $0 \leq x < 1$. Note that
$0 \leq x \leq 1$. Both functions are plotted in Figure 3.2
below. We conclude that $S''(a)$ is positive and therefore $S(a)$ is convex.

Figure 3.2: The functions $f(x) = \ln\left(\frac{1+x}{1-x}\right)$ and $g(x) = 2x$ plotted for $0 \leq x \leq 1$.

Writing $\bar{\rho} = \sum_j p_j \rho_{a_j}$ with $\bar{a} = \sum_j p_j a_j$, and since the entropy function $S$ is convex in $a$,
we have

\[
\chi(\{p_j, \Phi_{amp}(\rho_j)\}) = S(\Phi_{amp}(\bar{\rho})) - \sum_j p_j S(a_j) \leq S(\Phi_{amp}(\bar{\rho})) - S(\bar{a}). \tag{3.1.26}
\]



The capacity is therefore given by

\[
\chi^*(\Phi_{amp}) = \max_{a \in [0,1]} \left[ S\left( \tfrac{1}{2}\big(\sigma(a) + \sigma'(a)\big) \right) - S(\sigma(a)) \right], \tag{3.1.27}
\]

where $\sigma(a)$ is given by

\[
\sigma(a) = \Phi_{amp}(\rho_a) = \begin{pmatrix} a + (1-a)\gamma & \sqrt{a(1-a)}\sqrt{1-\gamma} \\ \sqrt{a(1-a)}\sqrt{1-\gamma} & (1-a)(1-\gamma) \end{pmatrix}, \tag{3.1.28}
\]

and hence

\[
\tfrac{1}{2}\big(\sigma(a) + \sigma'(a)\big) = \begin{pmatrix} a + (1-a)\gamma & 0 \\ 0 & (1-a)(1-\gamma) \end{pmatrix}. \tag{3.1.29}
\]

We have proved that $S(a)$ is convex. Therefore $-S(a)$ is concave. On the other
hand, it follows from the concavity of $S$ that the first term is also a concave function
of $a$.

It follows that $\chi_{AD}(a) = S\left(\tfrac{1}{2}(\sigma(a) + \sigma'(a))\right) - S(\sigma(a))$ is a concave function, and
its maximum is achieved at a single point. The maximising value of $a$ is given by
the transcendental equation $\chi'_{AD}(a) = 0$ and can only be computed numerically. In
Figure 3.3 we plot $\chi_{AD}(a)$ as a function of $a$.



The maximising $a$ for fixed $\gamma \in [0, 1]$ is plotted as a function of $\gamma$ in Figure 3.4.
Note that $a_{\max} > 0.5$ for all $\gamma$. This is easily proved. The determining equation is

\[
\chi'(a) = (1-\gamma) \ln\frac{(1-\gamma)(1-a)}{a + \gamma(1-a)} + \frac{2\gamma(1-\gamma)(1-a)}{x} \ln\frac{1+x}{1-x} = 0. \tag{3.1.30}
\]

Since $\chi(a)$ is concave, the statement follows if we show that $\chi'(0) > 0$ and
$\chi'(\tfrac{1}{2}) > 0$. But, if $a = 0$ then $x = \sqrt{1 - 4\gamma(1-\gamma)} = |1 - 2\gamma|$, so

\[
\chi'(0) = (1-\gamma) \ln\frac{1-\gamma}{\gamma} + \frac{2\gamma(1-\gamma)}{|1-2\gamma|} \ln\frac{1 + |1-2\gamma|}{1 - |1-2\gamma|} = \frac{1-\gamma}{1-2\gamma} \ln\frac{1-\gamma}{\gamma} > 0. \tag{3.1.31}
\]

For $a = \tfrac{1}{2}$ we have $x = \sqrt{1 - \gamma + \gamma^2}$ and

\[
\chi'(0.5) = -(1-\gamma) \ln\frac{1+\gamma}{1-\gamma} + \frac{\gamma(1-\gamma)}{x} \ln\frac{1+x}{1-x}. \tag{3.1.32}
\]

This is also positive because $x > \gamma$ and the function

\[
\frac{1}{2x} \ln\frac{1+x}{1-x} = \frac{\tanh^{-1}(x)}{x} \tag{3.1.33}
\]

is increasing. The resulting capacity is plotted in Figure 3.5.

Figure 3.3: $\chi_{AD}(a)$ for $0 \leq \gamma \leq 1$ plotted over $a$.

Figure 3.4: Maximising $a$'s for $0 \leq \gamma \leq 1$ for the amplitude-damping channel.

Figure 3.5: $\chi_{AD}(a_{\max})$ vs. $\gamma$.



3.1.2 Non-orthogonality of the maximising ensemble 

We have proved that the Holevo quantity for the amplitude-damping channel can 
be maximised using an ensemble of just two pure states. Concentrating on the first 
term, the average of two mutually orthogonal states will lie in the centre of the 
Bloch sphere, i.e. at $a = \tfrac{1}{2}$. However, we have proved that $a_{\max} > 0.5$ for all $\gamma$.
This implies that the product-state capacity of each amplitude-damping channel is 
achieved for an ensemble of non-orthogonal states. 

In [43], Fuchs compares the product-state capacities for the "splaying" channel with
both orthogonal and non-orthogonal input states, for certain values of the error
parameter. We will now compare the Holevo quantity of the qubit amplitude-damping
channel using orthogonal and non-orthogonal input states.

Example: We choose $\gamma = 0.2$ (any choice of parameter $\gamma \in (0, 1)$ will do),
and we first take $a = 0.5$, representing an orthogonal ensemble. We find that the
corresponding Holevo quantity is

\[
\chi_{\perp} \approx 0.720726. \tag{3.1.34}
\]

Again choosing $\gamma = 0.2$, but this time solving the transcendental equation $\chi'_{AD}(a) = 0$
numerically to find $a_{\max} \approx 0.567214$, we get

\[
\chi^* = \chi_{non\perp} \approx 0.731645. \tag{3.1.35}
\]

Since $a_{\max} \approx 0.567214$, this implies that the optimal input states are $\rho_{\pm} = |\psi_{\pm}\rangle\langle\psi_{\pm}|$,
where

\[
|\psi_{\pm}\rangle = \begin{pmatrix} \sqrt{a_{\max}} \\ \pm\sqrt{1 - a_{\max}} \end{pmatrix} \approx \begin{pmatrix} 0.753133 \\ \pm 0.657862 \end{pmatrix}. \tag{3.1.36}
\]

The angle between the two states $\rho$ and $\rho'$ is approximately 82 degrees. This
demonstrates that the optimal ensemble for the qubit amplitude-damping channel, with
error parameter $\gamma = 0.2$, consists of non-orthogonal states.
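
The numbers in this example can be reproduced with a few lines of code. The sketch below is one possible way to do so, not the method used to produce the figures: it evaluates $\chi_{AD}(a)$ from (3.1.27) using the closed forms (3.1.20) and (3.1.29), with entropies in bits, and locates the maximum by a golden-section search (a technique chosen here for simplicity, relying on the concavity shown above). The function names are hypothetical.

```python
import math

def h2(p):
    """Binary entropy in bits, with the convention 0 log 0 = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def chi_ad(a, gamma):
    """Holevo quantity of the two-state mirror-image ensemble, cf. (3.1.27)."""
    x = math.sqrt(max(0.0, 1 - 4 * gamma * (1 - gamma) * (1 - a) ** 2))
    first = h2((1 - a) * (1 - gamma))   # entropy of the diagonal average state (3.1.29)
    second = h2((1 + x) / 2)            # output entropy S(a) of a pure input state (3.1.22)
    return first - second

def maximise(f, lo=0.0, hi=1.0, tol=1e-10):
    """Golden-section search for the maximiser of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc > fd:                      # maximum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:                            # maximum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2

gamma = 0.2
a_max = maximise(lambda a: chi_ad(a, gamma))
chi_orth = chi_ad(0.5, gamma)            # orthogonal ensemble
chi_max = chi_ad(a_max, gamma)           # optimal (non-orthogonal) ensemble

# Overlap <psi_+|psi_-> = a - (1 - a) for the states in (3.1.36)
angle_deg = math.degrees(math.acos(2 * a_max - 1))

print(a_max, chi_orth, chi_max, angle_deg)
```

Running the same search over a grid of $\gamma$ values gives the curves of $a_{\max}$ and $\chi_{AD}(a_{\max})$ against $\gamma$ of the kind shown in Figures 3.4 and 3.5.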



Figure 3.6 shows that $\chi(a_{\max})$, i.e. the actual product-state capacity, differs from
$\chi(0.5)$ except when $\gamma = 0$ and $\gamma = 1$. This is due to the fact that $x = \gamma$ at $\gamma = 1$,
and since the error parameter $\gamma$ is at its maximum value we have $\chi(a_{\max}) = 0$.
At $\gamma = 0$, the first term of the Holevo quantity for the amplitude-damping channel is

\[
S\Big( \Phi_{AD}\Big( \sum_j p_j \rho_j \Big) \Big) = S \begin{pmatrix} a & 0 \\ 0 & 1-a \end{pmatrix},
\]

which is maximised at $a = 0.5$.

Figure 3.6: $\chi_{AD}(a_{\max})$ vs. $\gamma$ in blue and $\chi_{AD}(a = 0.5)$ vs. $\gamma$ in red.

Figure 3.7 below demonstrates the amplitude damping channel with $\gamma = \tfrac{1}{2}$, with the
optimal input states represented in blue and the corresponding output states in red.

Note that a qubit channel maps the Bloch sphere to an ellipsoid and the action of a
unital channel on the Bloch sphere results in an ellipsoid which is centred at the
origin of the Bloch sphere [44, 45].

Figure 3.7: Optimal input states (blue) to the amplitude-damping channel with $\gamma = 0.5$
and the resulting output states from the channel (red).

3.2 The generalised amplitude-damping channel 



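The operation elements for the generalised amplitude-damping channel [14] are as
follows

\[
E_0 = \sqrt{p} \begin{pmatrix} 1 & 0 \\ 0 & \sqrt{1-\gamma} \end{pmatrix}, \qquad
E_1 = \sqrt{p} \begin{pmatrix} 0 & \sqrt{\gamma} \\ 0 & 0 \end{pmatrix}, \tag{3.2.1}
\]

\[
E_2 = \sqrt{1-p} \begin{pmatrix} \sqrt{1-\gamma} & 0 \\ 0 & 1 \end{pmatrix}, \qquad
E_3 = \sqrt{1-p} \begin{pmatrix} 0 & 0 \\ \sqrt{\gamma} & 0 \end{pmatrix}. \tag{3.2.2}
\]

Remark 4. Note that we recover the traditional amplitude-damping channel for
$p = 1$.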



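Therefore the generalised amplitude-damping channel, $\Phi_{gad}$, acting on the qubit
state

\[
\rho = \begin{pmatrix} a & b \\ \bar{b} & 1-a \end{pmatrix}, \tag{3.2.3}
\]

can be written, for a pure input state with $b = \sqrt{a(1-a)}$, as

\[
\Phi_{gad}(\rho) = E_0 \rho E_0^{*} + E_1 \rho E_1^{*} + E_2 \rho E_2^{*} + E_3 \rho E_3^{*}
= \begin{pmatrix} a + p\gamma - a\gamma & \sqrt{a(1-a)}\sqrt{1-\gamma} \\ \sqrt{a(1-a)}\sqrt{1-\gamma} & a(\gamma - 1) - p\gamma + 1 \end{pmatrix}. \tag{3.2.4}
\]

The eigenvalues of $\Phi_{gad}(\rho)$ are,

\[
\lambda_{\pm} = \frac{1}{2} \left( 1 \pm \sqrt{1 + 4a^2\gamma^2 - 4a^2\gamma - 8ap\gamma^2 + 8ap\gamma + 4p^2\gamma^2 - 4p\gamma} \right). \tag{3.2.5}
\]

Again, letting

\[
x = \sqrt{1 + 4a^2\gamma^2 - 4a^2\gamma - 8ap\gamma^2 + 8ap\gamma + 4p^2\gamma^2 - 4p\gamma}, \tag{3.2.6}
\]

we have

\[
S(a) = -\left(\frac{1+x}{2}\right) \log\left(\frac{1+x}{2}\right) - \left(\frac{1-x}{2}\right) \log\left(\frac{1-x}{2}\right). \tag{3.2.7}
\]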
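
The expressions (3.2.1) to (3.2.6) can be cross-checked numerically. The following sketch, with illustrative parameter values and hypothetical function names, verifies the completeness relation, the closed-form output (3.2.4) for a pure input state, the eigenvalue formula (3.2.5), and the reduction to the amplitude-damping channel at $p = 1$.

```python
import numpy as np

def gad_kraus(gamma, p):
    """Operation elements E_0..E_3 of the generalised amplitude-damping channel."""
    s, g = np.sqrt(1 - gamma), np.sqrt(gamma)
    E0 = np.sqrt(p) * np.array([[1.0, 0.0], [0.0, s]])
    E1 = np.sqrt(p) * np.array([[0.0, g], [0.0, 0.0]])
    E2 = np.sqrt(1 - p) * np.array([[s, 0.0], [0.0, 1.0]])
    E3 = np.sqrt(1 - p) * np.array([[0.0, 0.0], [g, 0.0]])
    return [E0, E1, E2, E3]

def apply_channel(kraus, rho):
    """Kraus form of the channel (all matrices here are real)."""
    return sum(E @ rho @ E.T for E in kraus)

gamma, p, a = 0.5, 0.3, 0.6
b = np.sqrt(a * (1 - a))                  # pure input state with real b
rho = np.array([[a, b], [b, 1 - a]])

kraus = gad_kraus(gamma, p)
completeness = sum(E.T @ E for E in kraus)
out = apply_channel(kraus, rho)

# Closed form (3.2.4)
closed_form = np.array([
    [a + p * gamma - a * gamma, b * np.sqrt(1 - gamma)],
    [b * np.sqrt(1 - gamma), a * (gamma - 1) - p * gamma + 1],
])

# x from (3.2.6) and the eigenvalues (3.2.5), in ascending order
x = np.sqrt(1 + 4 * a**2 * gamma**2 - 4 * a**2 * gamma - 8 * a * p * gamma**2
            + 8 * a * p * gamma + 4 * p**2 * gamma**2 - 4 * p * gamma)
eig_formula = np.array([(1 - x) / 2, (1 + x) / 2])

# At p = 1 the output should coincide with the amplitude-damping output (3.1.4)
out_p1 = apply_channel(gad_kraus(gamma, 1.0), rho)
ad_closed_form = np.array([
    [a + (1 - a) * gamma, b * np.sqrt(1 - gamma)],
    [b * np.sqrt(1 - gamma), (1 - a) * (1 - gamma)],
])

print(np.allclose(completeness, np.eye(2)), np.allclose(out, closed_form))
print(np.allclose(np.linalg.eigvalsh(out), eig_formula), np.allclose(out_p1, ad_closed_form))
```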



Now,

\[
S'(a) \ln(2) = -\frac{2\gamma(1-\gamma)(p-a)}{x} \ln\frac{1+x}{1-x}, \tag{3.2.8}
\]

and the second derivative

\[
S''(a) \ln(2) = \frac{2\gamma(1-\gamma)}{x^3} \left( (1-c) \ln\frac{1+x}{1-x} - 2x\, \frac{1 - c - x^2}{1 - x^2} \right), \tag{3.2.9}
\]

where $c = 4\gamma p(1-p)$.

Since $S''(a)$ can be shown to be positive, we can conclude that $S(a)$ is a convex
function of $a$, and since the first term in the Holevo quantity is concave we deduce
that $\chi(a)$ is concave. Applying the technique used for the amplitude-damping
channel in Section 3.1, i.e. replacing each state in the ensemble by itself and its antipode
with half the original probability, we can maximise $\chi(a)$ over an ensemble
containing two pure states which are specified completely, for each fixed $\gamma$ and $p$, by a
single value $a$.

Figure 3.8 depicts the Holevo capacity $\chi^*$ as a function of $p$ for values of $\gamma \in [0, 1]$.
We can see that $\chi^*(p)$ varies more at larger values of $\gamma$. This is due to the fact that
$p$ only occurs with $\gamma$ in the expression for $\Phi_{gad}$ in Equation (3.2.4). Figure 3.9
below shows the Holevo quantity $\chi(a)$ for fixed $\gamma = 0.5$ and for $p$ fixed at various
values between 0 and 1. The symmetry between $p$ and $1 - p$ about $p = 0.5$ can
clearly be seen.

Figure 3.8: The Holevo capacity $\chi^*(p)$ for the generalised amplitude-damping
channel as a function of $p$ and for various values of $0 \leq \gamma \leq 1$.

3.3 The depolarising channel and the Holevo 
quantity 

The state of a qubit sent through the depolarising channel is replaced by a
completely mixed state with probability $\lambda$, as follows

\[
\Delta_{\lambda}(\rho) = (1-\lambda)\rho + \lambda\left(\frac{I}{2}\right). \tag{3.3.1}
\]



Figure 3.9: The Holevo quantity $\chi$ for the generalised amplitude damping channel,
with $\gamma = 0.5$, plotted for various values of $0 \leq p \leq 1$.

We demonstrate a method of obtaining the product-state capacity of a qubit
depolarising channel using a minimal input ensemble, analogous to the above argument
for the amplitude-damping channel. The depolarising channel acting on the qubit
state

\[
\rho = \begin{pmatrix} a & b \\ \bar{b} & 1-a \end{pmatrix}
\]

can be written as,

\[
\Delta_{\lambda}(\rho) = \begin{pmatrix} (1-\lambda)a + \frac{\lambda}{2} & b(1-\lambda) \\ \bar{b}(1-\lambda) & (1-\lambda)(1-a) + \frac{\lambda}{2} \end{pmatrix}. \tag{3.3.2}
\]

Note that for pure input states $b = \sqrt{a(1-a)}$ and the corresponding eigenvalues
are

\[
\lambda_{+} = \frac{\lambda}{2}, \qquad \lambda_{-} = 1 - \frac{\lambda}{2}. \tag{3.3.3}
\]

For a pure input state $\rho_j$, the second term in the Holevo quantity, $S(\Phi(\rho_j))$, given
by Equation (3.1.10), is a function of one variable only, in other words $S(\Phi(\rho_{a_j})) = S(a_j)$,
where

\[
\rho_a = \begin{pmatrix} a & \sqrt{a(1-a)} \\ \sqrt{a(1-a)} & 1-a \end{pmatrix}. \tag{3.3.4}
\]

Denoting $\chi_{Dep}(\{p_j, \rho_j\})$ by $\chi_{Dep}$, the Holevo quantity for the depolarising channel
with a pure state ensemble is given by

\[
\chi_{Dep} = S\left[ \sum_j \begin{pmatrix} p_j\big((1-\lambda)a_j + \frac{\lambda}{2}\big) & p_j \sqrt{a_j(1-a_j)}\,(1-\lambda) \\ p_j \sqrt{a_j(1-a_j)}\,(1-\lambda) & p_j\big((1-\lambda)(1-a_j) + \frac{\lambda}{2}\big) \end{pmatrix} \right] - \sum_j p_j S(a_j). \tag{3.3.5}
\]

The product-state capacity of the channel is achieved by maximising Equation
(3.3.5). We replace each state in the manner described in Section 3.1.1. Since the
eigenvalues given in Equation (3.3.3) do not depend on the state, the second term
in Equation (3.3.5) is also independent of the input state. Then, by concavity of the
entropy, the first term in Equation (3.3.5) is increased and hence Equation (3.3.5)
increases. From the concavity of the entropy $S$, it follows that the first term in the
above equation is a concave function of $a_j$. The Holevo quantity $\chi_{Dep}(\{p_j, \rho_j\})$ is
concave in the input state $\rho_a$ and its maximum can therefore be achieved at a single
point. The Holevo quantity now becomes

\[
\chi_{Dep}(\{\rho_a, \rho'_a\}) = S\left( \tfrac{1}{2}\big( \Delta_{\lambda}(\rho_a) + \Delta_{\lambda}(\rho'_a) \big) \right) - H\left(\frac{\lambda}{2}\right)
= H\left( (1-\lambda)a + \frac{\lambda}{2} \right) - H\left(\frac{\lambda}{2}\right), \tag{3.3.6}
\]

where $H(p) = -p\log(p) - (1-p)\log(1-p)$ is the binary entropy.
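
Equation (3.3.6) is simple enough to evaluate directly. The sketch below, with hypothetical function names and logarithms taken to base 2, scans $a$ over a grid for a few sample values of $\lambda$ and confirms that the maximum sits at $a = \tfrac{1}{2}$ with value $1 - H(\lambda/2)$. The grid resolution and the sample values of $\lambda$ are arbitrary choices for the illustration.

```python
import math

def h2(p):
    """Binary entropy in bits, with the convention 0 log 0 = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def chi_dep(a, lam):
    """Holevo quantity (3.3.6) of the two-state mirror-image ensemble."""
    return h2((1 - lam) * a + lam / 2) - h2(lam / 2)

grid = [i / 100 for i in range(101)]       # values of a in [0, 1]
results = {}
for lam in (0.1, 0.5, 0.9):
    best_a = max(grid, key=lambda a: chi_dep(a, lam))
    results[lam] = (best_a, chi_dep(best_a, lam), 1 - h2(lam / 2))

for lam, (best_a, value, expected) in results.items():
    print(lam, best_a, value, expected)
```

At $\lambda = 1$ the channel output is the completely mixed state for every input, so $\chi_{Dep}$ vanishes identically and the maximiser is no longer unique.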

As the second term above is independent of the input ensemble, we concentrate on
maximising the first term to obtain the product-state capacity. The average of any
two mutually orthogonal states will lie in the centre of the Bloch sphere, in other
words at $a = \tfrac{1}{2}$. We therefore obtain equal eigenvalues, both equal to $\tfrac{1}{2}$, for the first
term and the entropy is a maximum. The product-state capacity can therefore be
achieved with an ensemble containing any pair of orthogonal pure states $\rho_a$ and $\rho'_a$
with equal probability $\tfrac{1}{2}$.

We can therefore again maximise over a minimal ensemble of two mirror image
states with equal probability $\tfrac{1}{2}$. The maximum is clearly attained at $a = \tfrac{1}{2}$, as can be
seen in Figure 3.10. The value for $\chi^*(\Delta_{\lambda})$ is shown in Figure 3.11 as a function of
$\lambda$. The depolarising channel can therefore be considered to be rotationally invariant.
The product-state capacity for the qubit depolarising channel is $\chi^*_{Dep} = 1 - H\left(\frac{\lambda}{2}\right)$.
In fact, it was proved by King [46] that this is also the classical capacity of the
channel. In $d$ dimensions the capacity of the depolarising channel is

\[
\chi^*(\Delta_{\lambda}) = \log(d) - S_{\min}(\Delta_{\lambda}), \tag{3.3.7}
\]

where $S_{\min}$ is defined as,

\[
S_{\min}(\Delta_{\lambda}) = \min_{\rho} S(\Delta_{\lambda}(\rho)). \tag{3.3.8}
\]

Figure 3.10: The Holevo quantity for the depolarising channel as a function of $a$,
for $\lambda \in [0, 1]$.

Figure 3.11: $\chi^*(a_{\max})$ vs. $\lambda$.



3.4 Summary 

To summarise, we have introduced a method for calculating the product state capac- 
ity for the qubit amplitude damping channel using a minimal ensemble containing 
two antipodal states. We analysed the behaviour of the product state capacity of the 
channel as a function of its error parameter and also showed that the product state 
capacity of the channel is achieved using non-orthogonal states. 

Next we discussed the generalised amplitude damping channel and the depolaris- 
ing channel. We have shown that the technique used to calculate the product state 
capacity of the qubit amplitude damping channel can also be used to calculate the 
product state capacity of these channels. 
Chapter 4



The classical capacity of two 
quantum channels with memory 



4.1 Introduction 

The problem of determining the classical information-carrying capacity of a quan- 
tum channel is one which has not been fully resolved to date. In the case where the 
input to the channel is prepared in the form of non-entangled states, the classical 
capacity can be determined using a simple formula, that is, the supremum of the 
Holevo quantity introduced in Chapter 2. 

However, if entanglement between multiple uses of the channel is permitted, then
the channel capacity can only be determined asymptotically. Moreover, the
proposed additivity conjecture, which promised to provide such a "single-letter"
formula for the classical capacity of a channel with general input states, has recently
been disproved even in the case of memoryless channels [8].

Note that a channel is said to be memoryless if the noise acts independently on
each state sent over the channel. If $\Phi^{(n)} = \Phi^{\otimes n}$ is a memoryless channel, then the
product-state capacity is given by the supremum of the Holevo quantity evaluated
over all possible input state ensembles. This is also known as the Holevo capacity
$\chi^*(\Phi)$ of the channel.

We remark that Shor [47] (see also Pomeransky [48] and Fukuda [49]) proved
that the additivity conjectures involving the entanglement of formation [50], the
minimum output entropy [44], the strong superadditivity of the entanglement of
formation and the Holevo capacity [Q3|2l are in fact equivalent. 

The additivity conjecture of the Holevo capacity, discussed in detail in Section 4.2,
states that the rate at which classical information can be transmitted over a quantum 
channel cannot be improved by sending entangled codewords over copies of the 
channel. 



Additivity of the Holevo capacity has been proved for unital qubit channels [51],
entanglement-breaking channels [52], and the depolarising channel [46]. However,
Hastings [8] recently provided a counterexample to the above conjecture using
random unitary channels, thereby disproving the conjecture for memoryless channels.

We are concerned here with the classical capacity of two quantum channels with
memory. Note that it has been shown that the capacity of certain channels with
memory can be enhanced using entangled state inputs; see [53]-[60].

In particular, Macchiavello and Palma [54] proved that, although the product-state
capacity of the memoryless depolarising channel is additive [46], the product-state
capacity of the depolarising channel with partial noise correlations is in fact
non-additive. This result contributes to our motivation for considering the classical
capacity of quantum channels with memory.

Memoryless channels, i.e. channels which have no correlation between noise act- 
ing on successive channel inputs, can also be seen to be unrealistic, since real-world 
quantum channels may not exhibit this independence and correlations between
errors are common. Noise correlations are also necessary for certain models of
quantum communication (see [61], for example). These channels are known as bosonic
channels and such channels have received much attention in recent years. More- 
over, the classical and quantum capacities of lossy bosonic channels were recently 
evaluated IfTTTl . Note that the classical capacity of a bosonic memory channel with 
Gauss-Markov noise has also been recently investigated [62J. 



In this Chapter we consider the classical capacity of two particular types of chan- 
nels with memory consisting of depolarising channel branches, namely a periodic 
channel and a convex combination of memoryless channels. 

Datta and Dorlas derived a general expression for the classical capacity of a
quantum channel with arbitrary Markovian correlated noise.

We consider two special cases of this channel, that is, a periodic channel with
depolarising channel branches and a convex combination of memoryless channels, and
we prove that the corresponding capacities are additive in the sense that they are
equal to the product-state capacities.

To prove the additivity of the product-state capacity of the periodic channel with
depolarising channel branches, we use two properties of the memoryless channel,
namely the additivity of the memoryless depolarising channel [46], and the
fact that the product-state capacity of the memoryless depolarising channel can be
achieved using an ensemble containing any pair of orthogonal pure states. The latter
is demonstrated in Section 3.3.

We demonstrate in Section 4.6 that we cannot extend the technique used in the proof
for the periodic channel with depolarising channel branches to the
amplitude-damping channel.

On the other hand, to prove the additivity of the product-state capacity of a
convex combination of depolarising channels we only need the additivity of the Holevo
capacity for the memoryless depolarising channel. This result can therefore be
extended to include convex combinations of channels which have been proved to be
additive in the memoryless case. These include the unital qubit channels [51] and
the entanglement-breaking channels [52].



A convex combination of memoryless channels was discussed in [63] and can be
described by a Markov chain which is aperiodic but not irreducible. Both channels
are examples of a channel with long-term memory. See the note on Markov chains and
channel memory in Section 2.8.

We also consider the product-state capacity of a convex combination of two
depolarising channels, of two amplitude-damping channels, and of a depolarising channel
and an amplitude-damping channel. We show in the case of one depolarising channel
and one amplitude-damping channel that the corresponding product-state capacity,
which was shown in [3] to be given by the supremum of the minimum of the
corresponding Holevo quantities, is not equal to the minimum of their product-state
capacities.



4.2 Classical capacity 

Using product-state encoding, i.e. encoding a message into a tensor product of n
quantum states on a finite-dimensional Hilbert space $\mathcal{H}$, each state can be
transmitted over a quantum channel given by a completely positive trace-preserving
(CPT) map $\Phi^{(n)}$ on $\mathcal{B}(\mathcal{H}^{\otimes n})$. The associated
capacity is known as the product-state capacity of the channel. See Section 2.6.

On the other hand, a block of input states could be permitted to be entangled over n
channel uses. The classical capacity is defined as the limit of the capacity for such
n-fold entangled states divided by n, as n tends to infinity. If the Holevo capacity
of a memoryless channel is additive, then it is equal to the classical capacity of that
channel and there is no advantage to using entangled input state codewords. The
additivity conjecture for the Holevo capacity of most classes of memoryless channel
remains open. However, the Holevo capacity of certain memoryless quantum
channels has been shown to be additive (see [46, 51, 52], for example). On the
other hand, there now exists an example of a memoryless channel for which the 
conjecture does not hold, see BU. 



It was first shown in [29] that for some channels, it is possible to gain a higher rate
of transmission by sending entangled states across multiple copies of a quantum
channel. In general, allowing both entangled input states and output measurements
and with an unlimited number of copies of the channel, the classical capacity of $\Phi$
is given by [64]

\[ C(\Phi) = \lim_{n\to\infty} \frac{1}{n}\,\chi^*(\Phi^{(n)}), \tag{4.2.1} \]

where

\[ \chi^*(\Phi^{(n)}) = \sup_{\{p_j^{(n)},\rho_j^{(n)}\}} \Big[ S\Big(\Phi^{(n)}\Big(\sum_j p_j^{(n)}\rho_j^{(n)}\Big)\Big) - \sum_j p_j^{(n)}\, S\big(\Phi^{(n)}(\rho_j^{(n)})\big) \Big] \tag{4.2.2} \]

denotes the Holevo capacity of the channel $\Phi^{(n)}$ with an n-fold input state ensemble.

The Holevo capacity of a channel $\Phi$ is said to be additive if the following holds for
an arbitrary channel $\Psi$:

\[ \chi^*(\Phi \otimes \Psi) = \chi^*(\Phi) + \chi^*(\Psi). \tag{4.2.3} \]

In particular, if we can prove that the Holevo capacity of a particular channel is
additive then

\[ \chi^*(\Phi^{\otimes n}) = n\,\chi^*(\Phi), \tag{4.2.4} \]

which implies that the classical capacity of the memoryless channel $\Phi^{\otimes n}$ is equal to
the product-state capacity, that is,

\[ C(\Phi) = \chi^*(\Phi). \tag{4.2.5} \]

This will imply that the classical capacity of that channel cannot be increased by
entangling inputs across two or more uses of the channel.

Here we use the additivity of the memoryless depolarising channel to prove
Equation (4.2.5) for a periodic channel with depolarising channel branches and for a
convex combination of depolarising channels (replacing the $\chi^*(\Phi)$ term in
Equation (4.2.5) with the appropriate formula for calculating the product-state capacity
for the particular channel).



4.3 The periodic channel 



A periodic channel acting on an n-fold density operator has the form

\[ \Omega^{(n)}(\rho^{(n)}) = \frac{1}{L}\sum_{i=0}^{L-1} \big(\Omega_i \otimes \Omega_{i+1} \otimes \cdots \otimes \Omega_{i+n-1}\big)(\rho^{(n)}), \tag{4.3.1} \]

where $\Omega_i$ are CPT maps and the index is cyclic modulo the period L.

We denote the Holevo quantity for the i-th component $\Omega_i$ of the channel by

\[ \chi_i(\{p_j,\rho_j\}) = S\Big(\sum_j p_j\,\Omega_i(\rho_j)\Big) - \sum_j p_j\, S\big(\Omega_i(\rho_j)\big). \tag{4.3.2} \]



Since there is a correlation between the noise affecting successive input states to
the periodic channel (4.3.1), the channel is considered to have memory and the
product-state capacity of the channel is no longer given by the supremum of the
Holevo quantity. Instead, the product-state capacity of this channel is given by the
following expression

\[ C_p(\Omega) = \frac{1}{L}\sup_{\{p_j,\rho_j\}} \sum_{i=0}^{L-1} \chi_i(\{p_j,\rho_j\}). \tag{4.3.3} \]



The proof of the above formula (direct part) is provided in Appendix B. The strong 
converse is discussed in Chapter 5. 

Next, we introduce the depolarising channel and investigate the product-state ca- 
pacity of a periodic channel with depolarising channel branches. 

4.3.1 A periodic channel with depolarising channel branches 

Recall that the d-dimensional quantum depolarising channel can be written as
follows

\[ \Delta_\lambda(\rho) = \lambda\rho + \frac{1-\lambda}{d}\, I, \tag{4.3.4} \]

where $\rho \in \mathcal{B}(\mathcal{H})$ and I is the d x d identity matrix. Note that in order for the
channel to be completely positive the parameter $\lambda$ must lie within the range

\[ -\frac{1}{d^2-1} \le \lambda \le 1. \tag{4.3.5} \]

Output states from this channel have eigenvalues $\lambda + \frac{1-\lambda}{d}$ with multiplicity 1 and
$\frac{1-\lambda}{d}$ with multiplicity d - 1.

The minimum output entropy of a channel $\Phi$ is defined by

\[ S_{\min}(\Phi) = \inf_{\rho} S(\Phi(\rho)). \tag{4.3.6} \]

Using an ensemble containing orthogonal pure states, with uniform distribution,
results in the average input state $\rho = I/d$ and the product-state capacity of the
depolarising channel is given by

\[ \chi^*(\Delta_\lambda) = \log(d) - S_{\min}(\Delta_\lambda), \tag{4.3.7} \]

where the minimum entropy is also attained for any set of orthonormal vector states,
and is given by

\[ S_{\min}(\Delta_\lambda) = -\Big(\lambda + \frac{1-\lambda}{d}\Big)\log\Big(\lambda + \frac{1-\lambda}{d}\Big) - (d-1)\,\frac{1-\lambda}{d}\,\log\Big(\frac{1-\lambda}{d}\Big). \tag{4.3.8} \]
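The spectrum and entropy formulas above can be checked numerically. The following is a minimal sketch, assuming NumPy is available and measuring entropies in bits (logarithms to base 2); the function names `depolarise`, `entropy_bits`, `s_min` and `chi_star` are illustrative and are not taken from the thesis.

```python
import numpy as np

def depolarise(rho, lam):
    """Apply Delta_lambda(rho) = lam * rho + (1 - lam) / d * I, as in Eq. (4.3.4)."""
    d = rho.shape[0]
    return lam * rho + (1.0 - lam) / d * np.eye(d)

def entropy_bits(rho):
    """Von Neumann entropy in bits, computed from the eigenvalues of rho."""
    eigs = np.linalg.eigvalsh(rho)
    eigs = eigs[eigs > 1e-15]  # treat 0 log 0 as 0
    return float(-np.sum(eigs * np.log2(eigs)))

def s_min(lam, d):
    """Entropy of the spectrum stated in the text for a pure input state."""
    big = lam + (1.0 - lam) / d      # multiplicity 1
    small = (1.0 - lam) / d          # multiplicity d - 1
    eigs = np.array([big] + [small] * (d - 1))
    eigs = eigs[eigs > 1e-15]
    return float(-np.sum(eigs * np.log2(eigs)))

def chi_star(lam, d):
    """Product-state capacity log(d) - S_min of Eq. (4.3.7), in bits."""
    return float(np.log2(d)) - s_min(lam, d)

if __name__ == "__main__":
    d, lam = 3, 0.6
    psi = np.zeros(d)
    psi[0] = 1.0                                  # an arbitrary pure input state
    out = depolarise(np.outer(psi, psi), lam)
    print(np.linalg.eigvalsh(out))                # compare with the stated spectrum
    print(s_min(lam, d), entropy_bits(out))       # the two entropies should agree
    print(chi_star(lam, d))
```

Because the output spectrum does not depend on which pure state is sent, `s_min` needs only the parameter and the dimension, which is the property the additivity argument below relies on.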



Next we show that the product-state capacity of a periodic channel with L
depolarising channel branches is given by the average of the maxima of the Holevo
quantities of the individual depolarising channels, in other words we show that

\[ \frac{1}{L}\sup_{\{p_j,\rho_j\}} \sum_{i=0}^{L-1}\chi_i(\{p_j,\rho_j\}) = \frac{1}{L}\sum_{i=0}^{L-1}\sup_{\{p_j,\rho_j\}} \chi_i(\{p_j,\rho_j\}). \tag{4.3.9} \]

Let $\Delta_{\lambda_1}, \Delta_{\lambda_2}, \dots, \Delta_{\lambda_L}$ denote d-dimensional depolarising channels with respective
error parameters $\lambda_1, \lambda_2, \dots, \lambda_L$. Using the product-state capacity given by
Equation (4.3.7) and since every depolarising channel can be maximised using a single
ensemble of orthogonal pure states independently of the error parameter (as shown
in Section 3.3), the right-hand side of Equation (4.3.9) can be written as

\[ \frac{1}{L}\sum_{i=0}^{L-1}\sup_{\{p_j,\rho_j\}} \chi_i(\{p_j,\rho_j\}) = 1 - \frac{1}{L}\Big[ S_{\min}(\Delta_{\lambda_1}) + \cdots + S_{\min}(\Delta_{\lambda_L}) \Big]. \tag{4.3.10} \]

Clearly, the left-hand side of Equation (4.3.9) is bounded above by the right-hand
side

\[ \frac{1}{L}\sup_{\{p_j,\rho_j\}} \sum_{i=0}^{L-1}\chi_i(\{p_j,\rho_j\}) \le \frac{1}{L}\sum_{i=0}^{L-1}\sup_{\{p_j,\rho_j\}} \chi_i(\{p_j,\rho_j\}). \tag{4.3.11} \]

On the other hand, choosing the ensemble to be an orthogonal basis of states with
uniform probabilities, i.e. taking $\{p_j,\rho_j\}$ to be the optimal ensemble, we have

\[ \frac{1}{L}\sum_{i=0}^{L-1}\chi_i(\{p_j,\rho_j\}) = 1 - \frac{1}{L}\sum_{i=0}^{L-1} S_{\min}(\Delta_{\lambda_i}). \tag{4.3.12} \]

We can now conclude that Equation (4.3.9) holds for a periodic channel with L
depolarising branches of arbitrary dimension.
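The argument above can be illustrated numerically: one shared ensemble (an orthonormal basis with uniform weights) is evaluated on every branch and the results are averaged. This is a minimal sketch under the assumption that entropies are measured in bits, so that the constant term is log2(d), which equals 1 for qubits; the names `holevo_uniform_basis` and `periodic_product_state_capacity` are hypothetical.

```python
import numpy as np

def entropy_bits(rho):
    """Von Neumann entropy in bits."""
    eigs = np.linalg.eigvalsh(rho)
    eigs = eigs[eigs > 1e-15]  # treat 0 log 0 as 0
    return float(-np.sum(eigs * np.log2(eigs)))

def holevo_uniform_basis(lam, d):
    """Holevo quantity of Delta_lam for the computational basis with uniform weights."""
    eye = np.eye(d)
    # Channel outputs for each basis projector |k><k|, following Eq. (4.3.4).
    outputs = [lam * np.outer(eye[k], eye[k]) + (1.0 - lam) / d * eye for k in range(d)]
    average = sum(outputs) / d
    return entropy_bits(average) - sum(entropy_bits(o) for o in outputs) / d

def periodic_product_state_capacity(lams, d):
    """Average over the L branches of the Holevo quantity of one shared ensemble."""
    return sum(holevo_uniform_basis(lam, d) for lam in lams) / len(lams)

if __name__ == "__main__":
    # Three qubit branches with illustrative parameters.
    print(periodic_product_state_capacity([0.9, 0.5, 0.1], 2))
```

The same ensemble is used for every branch, so the value returned is a lower bound on the left-hand side of (4.3.9) that coincides with the right-hand side.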



4.3.2 The classical capacity of a periodic channel 

We now consider the classical capacity of the periodic channel, $\Omega_{\mathrm{per}}$, given by
Equation (4.3.1), where $\Omega_i = \Delta_{\lambda_i}$ are depolarising channels with dimension d. Denote
by $\Psi_0^{(n)}, \dots, \Psi_{L-1}^{(n)}$ the following product channels

\[ \Psi_i^{(n)} = \Delta_{\lambda_i} \otimes \Delta_{\lambda_{i+1}} \otimes \cdots \otimes \Delta_{\lambda_{i+n-1}}, \tag{4.3.13} \]

where the index i is taken modulo L.

We define a single use of the periodic channel, $\Omega_{\mathrm{per}}$, to be the application of one
of the depolarising maps $\Delta_{\lambda_i}$. If n copies of the channel are available, then with
probability $\frac{1}{L}$ one of the product branches $\Psi_i^{(n)}$ will be applied to an n-fold input
state.

We aim to prove the following theorem.

Theorem 4.1. The classical capacity of the periodic channel $\Omega_{\mathrm{per}}$ with depolarising
channel branches is equal to its product-state capacity,

\[ C(\Omega_{\mathrm{per}}) = C_p(\Omega_{\mathrm{per}}) = 1 - \frac{1}{L}\sum_{i=0}^{L-1} S_{\min}(\Delta_{\lambda_i}). \]

To prove Theorem 4.1 we first need a relationship between the supremum of the
Holevo quantity $\chi^*$ and the channel branches $\Psi_i^{(n)}$. King [46] proved that the
supremum of the Holevo quantity of the product channel $\Delta_\lambda \otimes \Psi$ is additive, where $\Delta_\lambda$
is a depolarising channel and $\Psi$ is a completely arbitrary channel, i.e.,

\[ \chi^*(\Delta_\lambda \otimes \Psi) = \chi^*(\Delta_\lambda) + \chi^*(\Psi). \tag{4.3.14} \]

It follows immediately that

\[ \chi^*(\Psi_i^{(n)}) = \chi^*(\Delta_{\lambda_i}) + \chi^*(\Psi_{i+1}^{(n-1)}) = \sum_{j=0}^{L-1}\chi^*(\Delta_{\lambda_j}) + \chi^*(\Psi_i^{(n-L)}). \tag{4.3.15} \]



Next, we use this result to prove Theorem 4.1.

Proof. The classical capacity of an arbitrary memoryless quantum channel $\Omega$ is
given by

\[ C(\Omega) = \lim_{n\to\infty}\frac{1}{n}\sup_{\{p_j^{(n)},\rho_j^{(n)}\}} \chi\big(\{p_j^{(n)}, \Omega^{(n)}(\rho_j^{(n)})\}\big). \tag{4.3.16} \]

In Section 4.3.1 we showed that the product-state capacity of the periodic channel
$\Omega_{\mathrm{per}}$, with depolarising channel branches denoted $\Delta_{\lambda_i}$, can be written as

\[ C_p(\Omega_{\mathrm{per}}) = \frac{1}{L}\sum_{i=0}^{L-1}\chi^*(\Delta_{\lambda_i}). \tag{4.3.17} \]

Using the product channels $\Psi_i^{(n)}$ defined by Equation (4.3.13), the periodic
channel $\Omega_{\mathrm{per}}$ can be written as

\[ \Omega_{\mathrm{per}}^{(n)}(\rho^{(n)}) = \frac{1}{L}\sum_{i=0}^{L-1}\Psi_i^{(n)}(\rho^{(n)}). \tag{4.3.18} \]

Since it is clear that

\[ C(\Omega_{\mathrm{per}}) \ge C_p(\Omega_{\mathrm{per}}), \tag{4.3.19} \]

we concentrate on proving the inequality in the other direction.
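First suppose that

\[ C(\Omega_{\mathrm{per}}) \ge \frac{1}{L}\sum_{i=0}^{L-1}\chi^*(\Delta_{\lambda_i}) + \epsilon \tag{4.3.20} \]

for some $\epsilon > 0$. Then there exists $n_0$ such that if $n \ge n_0$, then

\[ \frac{1}{n}\sup_{\{p_j^{(n)},\rho_j^{(n)}\}} \chi\big(\{p_j^{(n)}, \Omega_{\mathrm{per}}^{(n)}(\rho_j^{(n)})\}\big) > \frac{1}{L}\sum_{i=0}^{L-1}\chi^*(\Delta_{\lambda_i}) + \frac{\epsilon}{2}. \tag{4.3.21} \]

The supremum in Equation (4.3.21) is taken over all possible input ensembles
$\{p_j^{(n)},\rho_j^{(n)}\}$. Therefore, for $n \ge n_0$, there exists an ensemble $\{p_j^{(n)},\rho_j^{(n)}\}$ such that

\[ \frac{1}{n}\chi\big(\{p_j^{(n)}, \Omega_{\mathrm{per}}^{(n)}(\rho_j^{(n)})\}\big) > \frac{1}{L}\sum_{i=0}^{L-1}\chi^*(\Delta_{\lambda_i}) + \frac{\epsilon}{2}. \tag{4.3.22} \]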

The Holevo quantity can be expressed as the average of the relative entropy of
members of the ensemble with respect to the average ensemble state,

\[ \chi(\{p_k,\rho_k\}) = \sum_k p_k\, S\Big(\rho_k \,\Big\|\, \sum_{k'} p_{k'}\rho_{k'}\Big), \tag{4.3.23} \]

where $S(A\,\|\,B) = \mathrm{Tr}(A\log A) - \mathrm{Tr}(A\log B)$ represents the relative entropy of
A with respect to B.
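The identity (4.3.23) can be verified numerically for a small example. The sketch below assumes NumPy, uses logarithms to base 2, and restricts itself to full-rank states so that the matrix logarithm is well defined; the ensemble `PROBS`, `STATES` and all function names are illustrative choices, not taken from the thesis.

```python
import numpy as np

def entropy_bits(rho):
    """Von Neumann entropy in bits."""
    eigs = np.linalg.eigvalsh(rho)
    eigs = eigs[eigs > 1e-15]  # treat 0 log 0 as 0
    return float(-np.sum(eigs * np.log2(eigs)))

def log2m(rho):
    """Base-2 matrix logarithm of a positive definite Hermitian matrix."""
    w, v = np.linalg.eigh(rho)
    return (v * np.log2(w)) @ v.conj().T   # v diag(log w) v^dagger

def relative_entropy_bits(a, b):
    """S(A||B) = Tr(A log A) - Tr(A log B), for full-rank A and B."""
    return float(np.trace(a @ log2m(a) - a @ log2m(b)).real)

def holevo_from_entropies(probs, states):
    """Entropy of the average state minus the average entropy."""
    avg = sum(p * r for p, r in zip(probs, states))
    return entropy_bits(avg) - sum(p * entropy_bits(r) for p, r in zip(probs, states))

def holevo_from_relative_entropy(probs, states):
    """Average relative entropy to the average state, as in Eq. (4.3.23)."""
    avg = sum(p * r for p, r in zip(probs, states))
    return sum(p * relative_entropy_bits(r, avg) for p, r in zip(probs, states))

# A hypothetical ensemble of three full-rank qubit states.
PROBS = [0.5, 0.3, 0.2]
STATES = [
    np.array([[0.8, 0.0], [0.0, 0.2]], dtype=complex),
    np.array([[0.5, 0.3], [0.3, 0.5]], dtype=complex),
    np.array([[0.3, 0.1j], [-0.1j, 0.7]], dtype=complex),
]

if __name__ == "__main__":
    print(holevo_from_entropies(PROBS, STATES))
    print(holevo_from_relative_entropy(PROBS, STATES))
```

The two printed values should agree up to rounding, since the cross terms involving the logarithm of the average state sum to its entropy.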



Vedral [65] has argued that the distinguishability of quantum states can be measured
by the quantum relative entropy. Since the relative entropy is jointly convex in its
arguments [14], it follows that the Holevo quantity of the periodic channel $\Omega_{\mathrm{per}}$ is
also convex.

Therefore, by (4.3.18),

\[ \chi\big(\{p_j^{(n)}, \Omega_{\mathrm{per}}^{(n)}(\rho_j^{(n)})\}\big) \le \frac{1}{L}\sum_{i=0}^{L-1}\chi\big(\{p_j^{(n)}, \Psi_i^{(n)}(\rho_j^{(n)})\}\big). \tag{4.3.24} \]

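Using Equation (4.3.22) we thus have

\[ \frac{1}{L}\sum_{i=0}^{L-1}\chi^*(\Delta_{\lambda_i}) + \frac{\epsilon}{2} < \frac{1}{nL}\sum_{i=0}^{L-1}\chi\big(\{p_j^{(n)}, \Psi_i^{(n)}(\rho_j^{(n)})\}\big). \tag{4.3.25} \]

It follows that there is an index i such that

\[ \frac{1}{L}\sum_{k=0}^{L-1}\chi^*(\Delta_{\lambda_k}) + \frac{\epsilon}{2} < \frac{1}{n}\chi\big(\{p_j^{(n)}, \Psi_i^{(n)}(\rho_j^{(n)})\}\big). \tag{4.3.26} \]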

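But Equation (4.3.15) implies that

\[ \lim_{n\to\infty}\frac{1}{n}\,\chi^*(\Psi_i^{(n)}) = \frac{1}{L}\sum_{k=0}^{L-1}\chi^*(\Delta_{\lambda_k}). \tag{4.3.27} \]

Since the right-hand side of (4.3.26) is bounded above by $\frac{1}{n}\chi^*(\Psi_i^{(n)})$, the
inequality (4.3.26) cannot hold for n sufficiently large, and hence neither can the
assumption made in Equation (4.3.20). Therefore

\[ C(\Omega_{\mathrm{per}}) \le C_p(\Omega_{\mathrm{per}}). \tag{4.3.28} \]

The above inequality together with Equation (4.3.19) yields the required result. □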



4.4 The classical capacity of a convex combination of memoryless channels



In [63] the product-state capacity of a convex combination of memoryless channels
was determined. Given a finite collection of memoryless channels $\Phi_1, \dots, \Phi_M$ with
common input Hilbert space $\mathcal{H}$ and output Hilbert space $\mathcal{K}$, a convex combination
of these channels is defined by the map

\[ \Phi^{(n)}(\rho^{(n)}) = \sum_{i=1}^{M}\gamma_i\,\Phi_i^{\otimes n}(\rho^{(n)}), \tag{4.4.1} \]

where $\gamma_i$ (i = 1, ..., M) is a probability distribution over the channels
$\Phi_1, \dots, \Phi_M$. Thus, a given input state $\rho^{(n)} \in \mathcal{B}(\mathcal{H}^{\otimes n})$ is sent down one of the
memoryless channels with probability $\gamma_i$. This introduces long-term memory, and
as a result the (product-state) capacity of the channel $\Phi^{(n)}$ is no longer given by the
supremum of the Holevo quantity. Instead, it was proved in [63] that the
product-state capacity is given by

\[ C_p(\Phi) = \sup_{\{p_j,\rho_j\}}\Big[\bigwedge_{i=1}^{M}\chi\big(\{p_j,\Phi_i(\rho_j)\}\big)\Big], \tag{4.4.2} \]

where $\wedge$ denotes the minimum (and, below, $\vee$ denotes the maximum).

Again, let $\Delta_{\lambda_i}$ be depolarising channels with parameters $\lambda_i$, and let $\Phi_{\mathrm{rand}}$ denote
the channel whose memoryless channel branches are given by $\Delta_{\lambda_i}^{(n)}$ where

\[ \Delta_{\lambda_i}^{(n)} = \Delta_{\lambda_i}^{\otimes n}. \tag{4.4.3} \]

Since the capacity of the depolarising channel decreases with the error parameter
the product-state capacity of $\Phi_{\mathrm{rand}}$ is given by

\[ C_p(\Phi_{\mathrm{rand}}) = \bigwedge_{i=1}^{M}\chi^*(\Delta_{\lambda_i}) = \chi^*\Big(\bigvee_{i=1}^{M}\lambda_i\Big), \tag{4.4.4} \]

where

\[ \chi^*(\lambda) := \chi^*(\Delta_\lambda). \tag{4.4.5} \]

We aim to prove the following theorem.

Theorem 4.2. The classical capacity of a convex combination of depolarising
channels is equal to its product-state capacity,

\[ C(\Phi_{\mathrm{rand}}) = C_p(\Phi_{\mathrm{rand}}). \]



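Proof. According to [63] the classical capacity of this channel can be written as
follows

\[ C(\Phi_{\mathrm{rand}}) = \lim_{n\to\infty}\frac{1}{n}\sup_{\{p_j^{(n)},\rho_j^{(n)}\}} \bigwedge_{i=1}^{M}\chi\big(\{p_j^{(n)}, \Delta_{\lambda_i}^{(n)}(\rho_j^{(n)})\}\big). \tag{4.4.6} \]

Suppose that

\[ C(\Phi_{\mathrm{rand}}) \ge \bigwedge_{i=1}^{M}\chi^*(\Delta_{\lambda_i}) + \epsilon, \tag{4.4.7} \]

for some $\epsilon > 0$.

Then there exists $n_0$ such that if $n \ge n_0$, then

\[ \frac{1}{n}\sup_{\{p_j^{(n)},\rho_j^{(n)}\}} \bigwedge_{i=1}^{M}\chi\big(\{p_j^{(n)}, \Delta_{\lambda_i}^{(n)}(\rho_j^{(n)})\}\big) > \bigwedge_{i=1}^{M}\chi^*(\Delta_{\lambda_i}) + \frac{\epsilon}{2}. \tag{4.4.8} \]

Hence, for $n \ge n_0$ there exists an ensemble $\{p_j^{(n)},\rho_j^{(n)}\}$ such that

\[ \frac{1}{n}\bigwedge_{i=1}^{M}\chi\big(\{p_j^{(n)}, \Delta_{\lambda_i}^{(n)}(\rho_j^{(n)})\}\big) > \bigwedge_{i=1}^{M}\chi^*(\Delta_{\lambda_i}) + \frac{\epsilon}{2}. \tag{4.4.9} \]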

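But King [46] proved that the product-state capacity of the depolarising channel is
equal to its classical capacity, therefore

\[ \chi^*(\Delta_{\lambda_i}^{(n)}) = n\,\chi^*(\Delta_{\lambda_i}). \tag{4.4.10} \]

In other words, $\chi\big(\{p_j^{(n)}, \Delta_{\lambda_i}^{(n)}(\rho_j^{(n)})\}\big)$ is bounded above by $n\,\chi^*(\Delta_{\lambda_i})$. Now, if
$i_0$ is such that

\[ \bigwedge_{i=1}^{M}\chi^*(\Delta_{\lambda_i}) = \chi^*(\Delta_{\lambda_{i_0}}), \tag{4.4.11} \]

then

\[ \frac{1}{n}\chi\big(\{p_j^{(n)}, \Delta_{\lambda_{i_0}}^{(n)}(\rho_j^{(n)})\}\big) \le \chi^*(\Delta_{\lambda_{i_0}}). \tag{4.4.12} \]

Therefore

\[ \frac{1}{n}\bigwedge_{i=1}^{M}\chi\big(\{p_j^{(n)}, \Delta_{\lambda_i}^{(n)}(\rho_j^{(n)})\}\big) \le \bigwedge_{i=1}^{M}\chi^*(\Delta_{\lambda_i}). \tag{4.4.13} \]

This contradicts the assumption made by Equation (4.4.7) and therefore

\[ C(\Phi_{\mathrm{rand}}) \le \bigwedge_{i=1}^{M}\chi^*(\Delta_{\lambda_i}) = C_p(\Phi_{\mathrm{rand}}). \tag{4.4.14} \]

On the other hand, it is clear that $C(\Phi_{\mathrm{rand}}) \ge C_p(\Phi_{\mathrm{rand}})$, and therefore

\[ C(\Phi_{\mathrm{rand}}) = C_p(\Phi_{\mathrm{rand}}). \qquad \Box \]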

Remark 5. Note that, in contrast to the proof of Theorem 4.1, the proof above does
not rely on the invariance of the maximising ensemble of the depolarising channel.
The proof uses the additivity of the Holevo capacity of the depolarising channel
(see Equation (4.4.10)) and the result can therefore be generalised to all channels
for which the additivity of the Holevo capacity has been proved.

4.5 Convex combinations of two memoryless channels

Recall that it was shown in [3] that the product-state capacity of a convex
combination of memoryless channels, denoted $\Phi^{(n)}$, is given by (4.4.2). Note that the
following always holds

\[ C_p(\Phi) \le \bigwedge_i \chi^*(\Phi_i). \tag{4.5.1} \]

We investigate whether equality holds for the expression above in the following
three cases: a convex combination of two depolarising channels, a convex
combination of two amplitude-damping channels, and a convex combination of one
depolarising and one amplitude-damping channel.

4.5.1 Two depolarising channels 

In the case of a convex combination of two depolarising qubit channels
$\Delta_{\lambda_i}(\rho) = (1-\lambda_i)\rho + \lambda_i\,\frac{I}{2}$ with parameters $\lambda_1$ and $\lambda_2$, we have

\[ C_p(\Phi) = \chi^*(\Delta_{\lambda_1}) \wedge \chi^*(\Delta_{\lambda_2}) = \chi^*(\Delta_{\lambda_1 \vee \lambda_2}). \tag{4.5.2} \]

Indeed, since the maximising ensemble for both channels is the same (see Section 3.3),
namely two projections onto orthogonal states, this also maximises the minimum
$\chi_1 \wedge \chi_2$.

4.5.2 Two amplitude-damping channels

A convex combination of amplitude-damping channels is similar. In that case, the
maximising ensemble does depend on the parameter $\gamma$, but as can be seen from
Figure 3.3, for any a, $\chi_{AD}(a)$ decreases with $\gamma$, so $\chi(\gamma_1) \wedge \chi(\gamma_2) = \chi(\gamma_1 \vee \gamma_2)$ and
we have again,

\[ C_p(\Phi) = \chi^*(\gamma_1) \wedge \chi^*(\gamma_2) = \chi^*(\gamma_1 \vee \gamma_2). \tag{4.5.3} \]

The fact that $\chi_{AD}(a)$ decreases with $\gamma$ can be seen as follows. The derivative with
respect to $\gamma$ is given by

\[ \frac{\partial\chi}{\partial\gamma} = -(1-a)\ln\frac{a+\gamma(1-a)}{(1-\gamma)(1-a)} + \frac{(2\gamma-1)(1-a)^2}{x}\ln\frac{1+x}{1-x}, \tag{4.5.4} \]

where, as before, $x = \sqrt{1-4\gamma(1-\gamma)(1-a)^2}$.

For $\gamma < \frac12$, both terms are negative if $\frac{a}{1-a} > 1-2\gamma$. Otherwise, the first term is
positive and we remark that

\[ x > (1-2\gamma)(1-a), \tag{4.5.5} \]

so that it suffices if

\[ x > y = 1 - 2\gamma - 2a(1-\gamma). \tag{4.5.6} \]

This is easily checked.

In the case $\gamma > \frac12$, we need to show that

\[ f(a,\gamma) = \ln\frac{a+\gamma(1-a)}{(1-\gamma)(1-a)} - \frac{(2\gamma-1)(1-a)}{x}\ln\frac{1+x}{1-x} > 0. \]

Now, if a = 0, then $f(0,\gamma) = 0$. The derivative is easily computed to be

\[ \frac{\partial f}{\partial a} = \frac{1-\gamma}{a+\gamma(1-a)} + \frac{1}{1-a} + \frac{2\gamma-1}{x^3}\ln\frac{1+x}{1-x} - \frac{2(2\gamma-1)}{x^2}. \]

This is positive since the first two terms are positive and the other two are bounded
by

\[ \frac{2\gamma-1}{x^3}\ln\frac{1+x}{1-x} - \frac{2(2\gamma-1)}{x^2} = \frac{2\gamma-1}{x^2}\Big[\frac{1}{x}\ln\frac{1+x}{1-x} - 2\Big] > 0, \]

because $\ln\frac{1+x}{1-x} > 2x$ for $0 < x < 1$.

4.5.3 A depolarising channel & an amplitude-damping channel 

We now investigate the product-state capacity of a convex combination of an
amplitude-damping and a depolarising channel. Let $\chi_1$ and $\chi_2$ denote the Holevo
quantity of the amplitude-damping and depolarising channels respectively. They
are plotted in Figure 4.1 for $0 < \gamma, \lambda < 1$.

Figure 4.1: The Holevo quantity $\chi$ for the amplitude-damping channel and the
depolarising channel plotted as a function of a in red and black respectively.

The plot in Figure 4.1 indicates that, for certain values of $\gamma$ and $\lambda$, the maximiser
for the amplitude-damping channel lies to the right of the intersection of $\chi_1(a)$ and
$\chi_2(a)$, whereas that for the depolarising channel lies to the left. Indeed, keeping
$\lambda$ fixed, we can increase $\gamma$ until the maximum of $\chi_{AD}(a)$ lies above the graph of
$\chi(\Delta_\lambda)$. The two graphs then intersect at a value of a intermediate between $\frac12$ and
the maximiser for $\chi_{AD}$. This proves that the maximum of the minimum of the Holevo
quantities of the channels does not equal the minimum of the maximum of the
Holevo quantities of the channels.
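The gap between the two quantities can be seen in a small numerical experiment. This is a sketch under stated assumptions: both Holevo quantities are evaluated on the two-state mirror-image ensemble parametrised by a, the depolarising channel uses the parametrisation of Section 4.5.1 (so its Holevo quantity is taken to be $H_{bin}((1-\lambda)a + \lambda/2) - H_{bin}(\lambda/2)$), entropies are in bits, and the values $\gamma = 0.5$, $\lambda = 0.24$ are chosen only for illustration.

```python
import numpy as np

def h2(p):
    """Binary entropy in bits, with clipping to avoid log(0)."""
    p = np.clip(p, 1e-15, 1 - 1e-15)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def chi_ad(a, gamma):
    """Amplitude-damping Holevo quantity for the mirror-image ensemble."""
    x = np.sqrt(np.maximum(1 - 4 * gamma * (1 - gamma) * (1 - a) ** 2, 0.0))
    return h2((1 - a) * (1 - gamma)) - h2((1 + x) / 2)

def chi_dep(a, lam):
    """Qubit depolarising Holevo quantity for the same ensemble (assumed form)."""
    return h2((1 - lam) * a + lam / 2) - h2(lam / 2)

def compare(gamma, lam, points=2001):
    """Return (sup of the minimum, minimum of the suprema) on a grid in a."""
    a = np.linspace(0.0, 1.0, points)
    f, g = chi_ad(a, gamma), chi_dep(a, lam)
    sup_min = float(np.max(np.minimum(f, g)))
    min_sup = float(min(np.max(f), np.max(g)))
    return sup_min, min_sup

if __name__ == "__main__":
    sup_min, min_sup = compare(0.5, 0.24)
    print(sup_min, min_sup)   # the first value should be strictly smaller
```

For these parameters the depolarising curve peaks at a = 1/2 while the amplitude-damping curve peaks to the right of it, so the two curves cross between the maximisers, which is the configuration described in the text.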



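4.6 The periodic channel with amplitude-damping channel branches

Recall that the amplitude-damping channel acting on the state

\[ \rho = \begin{pmatrix} a & b \\ \bar b & 1-a \end{pmatrix} \]

is given by

\[ \Phi_{amp}(\rho) = \begin{pmatrix} a + (1-a)\gamma & b\sqrt{1-\gamma} \\ \bar b\sqrt{1-\gamma} & (1-a)(1-\gamma) \end{pmatrix}. \tag{4.6.1} \]

Recall also the expression for the product-state capacity of the amplitude-damping
channel,

\[ \chi(\Phi_{amp}(\{p_j,\rho_j\})) = S\Bigg(\sum_j p_j \begin{pmatrix} a_j + (1-a_j)\gamma & b_j\sqrt{1-\gamma} \\ \bar b_j\sqrt{1-\gamma} & (1-a_j)(1-\gamma) \end{pmatrix}\Bigg) - \sum_j p_j\, S\begin{pmatrix} a_j + (1-a_j)\gamma & b_j\sqrt{1-\gamma} \\ \bar b_j\sqrt{1-\gamma} & (1-a_j)(1-\gamma) \end{pmatrix}. \tag{4.6.2} \]

In Section 3.1 we argued that the Holevo quantity for the amplitude-damping channel
can be increased by replacing each pure state $\rho_j$ in the ensemble by itself and its
mirror image, each with half the original probability. Let

\[ \rho = \begin{pmatrix} a & b \\ \bar b & 1-a \end{pmatrix}, \qquad \rho' = \begin{pmatrix} a & -b \\ -\bar b & 1-a \end{pmatrix}. \tag{4.6.3} \]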





We now investigate whether the following equation holds for a periodic channel
with two amplitude-damping channel branches

\[ \frac12 \sup_{\{p_j,\rho_j\}}\Big(\sum_{i=0}^{1}\chi_i(\{p_j,\rho_j\})\Big) = \frac12\sum_{i=0}^{1}\Big(\sup_{\{p_j,\rho_j\}}\chi_i(\{p_j,\rho_j\})\Big). \tag{4.6.4} \]

Note that, unlike the depolarising channel, the maximising ensemble for the
amplitude-damping channel does not, in general, consist of orthogonal pure states.
Instead the maximising ensemble depends on the value of the error parameter.

Let $\gamma_0$ and $\gamma_1$ represent the error parameters for two amplitude-damping channels
$\Phi_0$ and $\Phi_1$ respectively. We have argued that the Holevo quantity for the
amplitude-damping channel can be increased using an ensemble containing two mirror
image pure states each with probability $\frac12$. Using this minimal ensemble we investigate
both sides of Equation (4.6.4), for a periodic channel with two amplitude-damping
channel branches.

Since the sum of the Holevo quantities of the two amplitude-damping channels is
concave in the state parameter a, it is maximised for a single parameter. The left
hand side of Equation (4.6.4) will therefore be attained for a single $a_{\max}$.
In other words, the maximising ensemble will contain two equiprobable states, and
the left hand side can be written as

\[ \frac12 \sup_{\{p_j,\rho_j\}}\Big(\sum_{i=0}^{1}\chi_i(\{p_j,\rho_j\})\Big) = \frac12\big[\chi^*(\gamma_0,\gamma_1,a = a_{\max})\big] \]
\[ = \frac12 H_{bin}\big((1-a_{\max})(1-\gamma_0)\big) + \frac12 H_{bin}\big((1-a_{\max})(1-\gamma_1)\big) \tag{4.6.5} \]
\[ \quad - \frac12\Big\{ S\big(\Phi_0(\rho_{a_{\max}})\big) + S\big(\Phi_1(\rho_{a_{\max}})\big) \Big\}. \tag{4.6.6} \]

Note that given the eigenvalues for the amplitude-damping channel

\[ \lambda_{amp\pm} = \frac12\Big(1 \pm \sqrt{1 - 4\gamma(1-\gamma)(1-a)^2}\Big), \tag{4.6.7} \]

again, we denote

\[ x = \sqrt{1 - 4\gamma(1-\gamma)(1-a)^2}. \tag{4.6.8} \]

Let $\chi(\gamma_0,\gamma_1,a)$ denote the sum of the Holevo quantities of the two channels. The
value for $a_{\max}$ can be determined by solving the following equation

\[ \frac{d\chi(\gamma_0,\gamma_1,a)}{da} = \frac12(1-\gamma_0)\ln\frac{(1-a)(1-\gamma_0)}{a+(1-a)\gamma_0} + \frac12(1-\gamma_1)\ln\frac{(1-a)(1-\gamma_1)}{a+(1-a)\gamma_1} \]
\[ \quad + \frac12\,\frac{2\gamma_0(1-\gamma_0)(1-a)}{x_0}\ln\frac{1+x_0}{1-x_0} + \frac12\,\frac{2\gamma_1(1-\gamma_1)(1-a)}{x_1}\ln\frac{1+x_1}{1-x_1} = 0. \tag{4.6.9} \]

The right hand side of Equation (4.6.4) cannot be obtained by a single $a_{\max}$.
Instead, the supremum for each channel will be attained at a different value of the
input state parameter a. If we denote by $a_{\max_0}$ and $a_{\max_1}$ the state parameter that
achieves the product-state capacity for the channels $\Phi_0$ and $\Phi_1$ respectively, then
the right hand side of Equation (4.6.4) can be written as

\[ \frac12\sum_{i=0}^{1}\Big(\sup_{\{p_j,\rho_j\}}\chi_i\Big) = \frac12\big(\chi^*(\gamma_0,a_{\max_0}) + \chi^*(\gamma_1,a_{\max_1})\big) \]
\[ = \frac12\Big[ H_{bin}\big((1-a_{\max_0})(1-\gamma_0)\big) + H_{bin}\big((1-a_{\max_1})(1-\gamma_1)\big) - S\big(\Phi_0(\rho_{a_{\max_0}})\big) - S\big(\Phi_1(\rho_{a_{\max_1}})\big) \Big]. \tag{4.6.10} \]

Let $\chi_0(a)$ and $\chi_1(a)$ denote the Holevo quantities of the channels $\Phi_0$ and $\Phi_1$,
respectively. Denoting $x_{0,1} = \sqrt{1 - 4\gamma_{0,1}(1-\gamma_{0,1})(1-a)^2}$, the values for $a_{\max_0}$
and $a_{\max_1}$ can be determined by separately solving the following two equations

\[ \frac{d\chi_0(a)}{da} = (1-\gamma_0)\ln\frac{(1-a)(1-\gamma_0)}{a+(1-a)\gamma_0} + \frac{2\gamma_0(1-\gamma_0)(1-a)}{x_0}\ln\frac{1+x_0}{1-x_0} = 0, \tag{4.6.11} \]

\[ \frac{d\chi_1(a)}{da} = (1-\gamma_1)\ln\frac{(1-a)(1-\gamma_1)}{a+(1-a)\gamma_1} + \frac{2\gamma_1(1-\gamma_1)(1-a)}{x_1}\ln\frac{1+x_1}{1-x_1} = 0. \tag{4.6.12} \]

Let $\chi^*_{avg}(\gamma_0,\gamma_1,a_{\max_0},a_{\max_1})$ denote the average of the suprema of the Holevo
quantities of the channels $\Phi_0$ and $\Phi_1$,

\[ \chi^*_{avg}(\gamma_0,\gamma_1,a_{\max_0},a_{\max_1}) = \frac12\big(\chi_0(a_{\max_0}) + \chi_1(a_{\max_1})\big). \tag{4.6.13} \]

It is not difficult to show that

\[ \chi^*(\gamma_0 = 1,\gamma_1,a_{\max}) = \chi^*_{avg}(\gamma_0 = 1,\gamma_1,a_{\max_0},a_{\max_1}). \tag{4.6.14} \]

Similarly, we can show that

\[ \chi^*(\gamma_0,\gamma_1 = 1,a_{\max}) = \chi^*_{avg}(\gamma_0,\gamma_1 = 1,a_{\max_0},a_{\max_1}). \]

Next, we let one of the error parameters equal zero. Taking $\gamma_0 = 0$, the expression
$\chi^*(\gamma_0,\gamma_1,a_{\max})$ becomes

\[ \chi^*(\gamma_0 = 0,\gamma_1,a_{\max}) = H_{bin}(a_{\max}) + H_{bin}\big((1-a_{\max})(1-\gamma_1)\big) - S\big(\Phi_1(\rho_{a_{\max}})\big). \tag{4.6.15} \]

Denoting $\chi^*_{avg}(\gamma_0,\gamma_1,a_{\max_0},a_{\max_1})$ by $\chi^*_{avg}(\gamma_1)$ the right hand side becomes

\[ \chi^*_{avg}(\gamma_1) = H_{bin}(a_{\max_0}) + H_{bin}\big((1-a_{\max_1})(1-\gamma_1)\big) - S\big(\Phi_1(\rho_{a_{\max_1}})\big). \tag{4.6.16} \]

We will now show that

\[ \chi^*(\gamma_0 = 0,\gamma_1,a_{\max}) < \chi^*_{avg}(\gamma_0 = 0,\gamma_1,a_{\max_0},a_{\max_1}). \tag{4.6.17} \]

Clearly, $a_{\max_0} = \frac12$. To show that $a_{\max} < a_{\max_1}$, we must show that
$\frac{d}{da}\sum_i\chi_i(a) < 0$ at $a = a_{\max_1}$.

In other words, we want to show that

\[ \frac{d\chi_0(a)}{da} + \frac{d\chi_1(a)}{da} < 0 \tag{4.6.18} \]

at $a = a_{\max_1}$.



For $\gamma_0 = 0$ the Holevo quantity of the channel $\Phi_0$ becomes

\[ \chi_0(a) = S\begin{pmatrix} a & 0 \\ 0 & 1-a \end{pmatrix} - S(\rho). \tag{4.6.19} \]

But $\rho$ is a pure state and therefore $S(\rho) = 0$. Therefore, from Equation (4.6.11),

\[ \frac{d\chi_0(a)}{da} = \ln\Big(\frac{1-a}{a}\Big). \tag{4.6.20} \]

We have previously shown that the maximising state parameter for the
amplitude-damping channel is achieved at $a \ge \frac12$. We are considering the case where
$\gamma_0 \ne \gamma_1$, i.e. $\gamma_1 \ne 0$, therefore $a_{\max_1} > \frac12$. The expression $\chi_0(a)$ now represents the
binary entropy, H(a), and is therefore maximised at $a = \frac12$. It was shown above that
the entropy S(a) is a strictly concave function for $\gamma_0 = 0$ and $\chi_0(a)$ is therefore
decreasing at $a = a_{\max_1}$.

The capacity $\chi_1(a)$ is achieved at $a = a_{\max_1}$. Therefore $\frac{d\chi_1(a)}{da}$ is equal to zero at
this point.

We can now conclude that $\frac{d}{da}\sum_i\chi_i(a) < 0$ when $a = a_{\max_1}$ and therefore

\[ \chi^*(\gamma_0 = 0,\gamma_1,a_{\max}) < \chi^*_{avg}(\gamma_0 = 0,\gamma_1,a_{\max_0},a_{\max_1}). \tag{4.6.21} \]

We now show that an inequality exists between the expressions $\chi^*(\gamma_0,\gamma_1,a_{\max})$
and $\chi^*_{avg}(\gamma_0,\gamma_1,a_{\max_0},a_{\max_1})$ for fixed $\gamma_0$, such that $0 < \gamma_0 < 1$.

In Section 4.5.2 we proved that if $\gamma_0 < \gamma_1$, then $\chi(\gamma_0) > \chi(\gamma_1)$ and therefore
$a_{\max_0} < a_{\max_1}$. Therefore, $\frac{d\chi_0(a)}{da} < 0$ at $a = a_{\max_1}$ and $a_{\max} < a_{\max_1}$. Similarly, if
$\gamma_0 > \gamma_1$, then $a_{\max_0} > a_{\max_1}$ and $\frac{d\chi_0(a)}{da} > 0$ at $a = a_{\max_1}$ and $a_{\max} > a_{\max_1}$.

As a result, $a_{\max}$ will always lie in between $a_{\max_0}$ and $a_{\max_1}$. We have previously
shown that the Holevo quantity for the amplitude-damping channel is concave in
its state parameter. Therefore a max > a, where a is the parameter value associated 
with Xlvgililh amax , a maxi ), Le - Ei su PaXi(a) =Xy om (a). This proves that 

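\[ \chi^*(\gamma_0,\gamma_1,a_{\max}) < \chi^*_{avg}(\gamma_0,\gamma_1,a_{\max_0},a_{\max_1}). \]

In conclusion, if $\gamma_0 = 1$ or $\gamma_1 = 1$, then $a_{\max} = a_{\max_0}$ or $a_{\max} = a_{\max_1}$ respectively
and $\chi^*(\gamma_0,\gamma_1,a_{\max}) = \chi^*_{avg}(\gamma_0,\gamma_1,a_{\max_0},a_{\max_1})$. However, if $\gamma_0, \gamma_1 \ne 1$ and
$\gamma_0 \ne \gamma_1$, then $\chi^*(\gamma_0,\gamma_1,a_{\max}) < \chi^*_{avg}(\gamma_0,\gamma_1,a_{\max_0},a_{\max_1})$. Therefore, in the case of a
periodic channel with amplitude-damping channel branches

\[ \frac12 \sup_{\{p_j,\rho_j\}}\Big(\sum_{i=0}^{1}\chi_i(\{p_j,\rho_j\})\Big) \ne \frac12\sum_{i=0}^{1}\Big(\sup_{\{p_j,\rho_j\}}\chi_i(\{p_j,\rho_j\})\Big). \tag{4.6.22} \]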
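The failure of equality can be illustrated by evaluating both sides of (4.6.4) on a grid of state parameters. The sketch below is a numerical illustration only, assuming the minimal mirror-image ensemble so that each branch has Holevo quantity $H_{bin}((1-a)(1-\gamma)) - H_{bin}((1+x)/2)$, with entropies in bits; the function names and the example parameters $\gamma_0 = 0$, $\gamma_1 = 0.5$ are illustrative.

```python
import numpy as np

def h2(p):
    """Binary entropy in bits, with clipping to avoid log(0)."""
    p = np.clip(p, 1e-15, 1 - 1e-15)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def chi_ad(a, gamma):
    """Amplitude-damping Holevo quantity for the mirror-image ensemble."""
    x = np.sqrt(np.maximum(1 - 4 * gamma * (1 - gamma) * (1 - a) ** 2, 0.0))
    return h2((1 - a) * (1 - gamma)) - h2((1 + x) / 2)

def both_sides(gamma0, gamma1, points=4001):
    """Return (left, right) hand sides of Eq. (4.6.4), maximised over a grid in a."""
    a = np.linspace(0.0, 1.0, points)
    chi0, chi1 = chi_ad(a, gamma0), chi_ad(a, gamma1)
    lhs = 0.5 * np.max(chi0 + chi1)               # one shared parameter a_max
    rhs = 0.5 * (np.max(chi0) + np.max(chi1))     # separate maximisers per branch
    return float(lhs), float(rhs)

if __name__ == "__main__":
    print(both_sides(0.0, 0.5))   # left value should be strictly smaller
    print(both_sides(0.3, 0.3))   # identical branches: the two values coincide
```

When the two branches are identical the maximisers coincide and the two sides agree, which is consistent with the restriction to $\gamma_0 \ne \gamma_1$ in the argument above.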



4.7 Summary 

In summary, we have investigated the classical capacity of two particular quantum 
channels with memory, namely a periodic quantum channel and a random quantum 
channel. We have shown that in both cases the product state capacity of each chan- 
nel is equal to its classical capacity. We can therefore conclude that entangled input 
state codewords do not enhance the classical capacity of these channels. 

Next we showed that the formula for the product-state capacity of the periodic
channel, which is given by the supremum of the average of the Holevo quantities for the
channel branches, cannot be written as the average of the Holevo capacities
evaluated for each channel branch, when the channel branches consist of
amplitude-damping channels. This result has an important implication in the next chapter.

Chapter 5

Strong converse to the channel coding theorem for a periodic quantum channel


We introduce the channel coding theorem and concentrate, in particular, on the
strong converse to the coding theorem for quantum channels. We discuss the fact
that the strong converse does not hold for the product-state capacity of the
periodic channel introduced in Chapter 4 and we demonstrate a so-called "weakened"
strong converse for this channel. See [66] for a survey of quantum coding theorems.



Remark 6. Wehner and Konig mn recently proved the fully general strong con- 
verse theorem for a family of channels, that is, they proved that the strong converse 
theorem holds for a family of quantum channels even in the case when entangled 
state inputs are allowed.

5.1 Coding theorems and quantum channels

The channel coding theorem comprises two parts, namely the direct part of
the theorem, which refers to the construction of the code, and the converse to the
theorem. Shannon [7] proposed the theorem for classical channels and the first
rigorous proof was provided by Feinstein [15].

The capacity of a quantum channel $\Phi$ provides a limit on the amount of information
which can be transmitted reliably per channel use. The direct part of the quantum
channel coding theorem states that using n copies of the channel, we can code with
exponentially small probability of error at a rate $R = \frac{1}{n}\log|\mathcal{M}|$ if and only if
R < C, in the asymptotic limit, where $\mathcal{M}$ denotes the set of possible codewords
to be transmitted. If the rate at which classical information is transmitted over a
quantum channel exceeds the capacity of the channel, i.e. if R > C, then the
probability of decoding the information correctly goes to zero in the number of
channel uses. This is known as the strong converse to the channel coding theorem.

The strong converse to the channel coding theorem of a so-called classical-quantum
channel was proved independently by Winter [4] and by Ogawa and Nagaoka
[68] using different methods. We will follow the method used by Winter, namely,
the method of types, which was used by Wolfowitz [69] to prove the strong converse
for classical channels. See [23, 70] for an introduction to the method of types.

We begin by introducing some notation. Recall that a memoryless channel is given
by a completely positive trace-preserving map $\Phi : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{K})$, where $\mathcal{B}(\mathcal{H})$ and
$\mathcal{B}(\mathcal{K})$ denote the states on the input and output Hilbert spaces $\mathcal{H}$ and $\mathcal{K}$, respectively.

Equivalently, we can describe a so-called classical-quantum channel, usually
denoted W, as a mapping from the classical message to the output state of the channel
on $\mathcal{B}(\mathcal{K})$ as follows,

\[ W : \mathcal{X} \to \mathcal{B}(\mathcal{K}), \tag{5.1.1} \]

where the message is first encoded into a sequence belonging to the set $\mathcal{X}^n$, where $\mathcal{X}$
represents the input alphabet. The process is shown in Figure 5.1 [23].

We can combine the two mapping descriptions as follows. We wish to send classical
information in the form of quantum states over a quantum channel $\Phi$. A (discrete)
memoryless quantum channel, $\Phi$, carrying classical information can be thought of
as a map from a (finite) set, or alphabet, $\mathcal{X}$ into $\mathcal{B}(\mathcal{K})$, taking each $x \in \mathcal{X}$ to
$\Phi_x = \Phi(\rho_x)$, where the input state to the channel is given by $\{\rho_x\}_{x\in\mathcal{X}}$ and each
$\rho_x \in \mathcal{B}(\mathcal{H})$. Let $d = \dim(\mathcal{H})$ and $a = |\mathcal{X}|$.

For a probability distribution P on the input alphabet $\mathcal{X}$, the average output state of
a channel $\Phi$ is given by

\[ P\sigma = \sum_{x\in\mathcal{X}} P(x)\,\Phi(\rho_x). \tag{5.1.2} \]

The conditional von Neumann entropy of $\Phi$ given P is defined by

\[ S(\Phi|P) = \sum_{x\in\mathcal{X}} P(x)\, S(\Phi(\rho_x)), \tag{5.1.3} \]

and the mutual information between the probability distribution P and the channel
$\Phi$ is defined as follows,

\[ I(P;\Phi) = S(P\sigma) - S(\Phi|P). \tag{5.1.4} \]

The product-state capacity of the channel $\Phi$ is given by the maximum of the mutual
information (Equation (5.1.4)), taken over all possible probability distributions P, i.e.

\[ \chi^*(\Phi) = \max_{P} I(P;\Phi). \tag{5.1.5} \]

An n-block code for a quantum channel $\Phi$ is a pair $(C_n, E_n)$, where $C_n$ is a
mapping from a finite set of messages $\mathcal{M}$, of length n, into $\mathcal{X}^n$, i.e. a sequence $x^n \in \mathcal{X}^n$
is assigned to each of the $|\mathcal{M}|$ messages, and $E_n$ is a POVM, i.e. a quantum
measurement, on the output space $\mathcal{K}^{\otimes n}$ of the channel $\Phi^{\otimes n}$. This process is depicted in
Figure 5.1 below.

Figure 5.1: Transmission over a classical-quantum channel.



The maximum error probability of the code $(C_n, E_n)$ is defined as

\[ p_e(C_n, E_n) = \max\big\{ 1 - \mathrm{Tr}\big(\Phi^{(n)}_{C_n(m)} E^{(n)}_m\big) : m \in \mathcal{M} \big\}. \tag{5.1.6} \]

The code $(C_n, E_n)$ is called an $(n,\lambda)$-code, if $p_e(C_n, E_n) \le \lambda$. The maximum size
$|\mathcal{M}|$ of an $(n,\lambda)$-code is denoted $N(n,\lambda)$.

5.2 Method of types

Next we employ a technique known as the method of types [70] to exploit the
properties of variance-typical sequences, leading to a sharp bound on the rate at which
quantum information can be reliably transmitted over a memoryless quantum
channel. This is achieved through the strong converse theorem.

Define a finite alphabet $\mathcal{X}$ and sequences $x^n = x_1, \dots, x_n \in \mathcal{X}^n$ and let

\[ N(x|x^n) = |\{ i \in \{1,\dots,n\} : x_i = x \}| \tag{5.2.1} \]

for $x \in \mathcal{X}$.

The type of the sequence $x^n$ is given by the empirical distribution $P_{x^n}$ on $\mathcal{X}$ such
that

\[ P_{x^n}(x) = \frac{N(x|x^n)}{n}. \tag{5.2.2} \]

Clearly, the number of types is upper bounded by $(n+1)^a$, where $a = |\mathcal{X}|$.

Now define the set of variance-typical sequences of length n and of approximate
type P, for $\delta > 0$, as follows

\[ \mathcal{T}^n_{P,\delta} = \Big\{ x^n \in \mathcal{X}^n : \forall x \in \mathcal{X},\ |N(x|x^n) - nP(x)| \le \delta\sqrt{n}\sqrt{P(x)(1-P(x))} \Big\}. \tag{5.2.3} \]

A set of type P (rather than approximate type P) is denoted $\mathcal{T}^n_P$, i.e. $\delta = 0$.
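The definitions (5.2.1) to (5.2.3) translate directly into a few lines of code. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and the exact count of types used in `num_types` (the number of ways to split n among a symbols) is a standard combinatorial fact rather than something stated in the text.

```python
from collections import Counter
import math

def type_of(seq, alphabet):
    """Empirical distribution P_{x^n}(x) = N(x | x^n) / n, Eq. (5.2.2)."""
    n = len(seq)
    counts = Counter(seq)
    return {x: counts.get(x, 0) / n for x in alphabet}

def is_variance_typical(seq, P, delta):
    """Membership test for the variance-typical set of Eq. (5.2.3)."""
    n = len(seq)
    counts = Counter(seq)
    return all(
        abs(counts.get(x, 0) - n * P[x])
        <= delta * math.sqrt(n) * math.sqrt(P[x] * (1 - P[x]))
        for x in P
    )

def num_types(n, a):
    """Exact number of types of length-n sequences over an alphabet of size a."""
    return math.comb(n + a - 1, a - 1)

if __name__ == "__main__":
    P = {"a": 0.5, "b": 0.5}
    print(type_of("abab", "ab"))
    print(is_variance_typical("abab", P, 0.0))   # exact type P
    print(is_variance_typical("aaaa", P, 1.0))   # too far from P for delta = 1
    print(num_types(4, 2), (4 + 1) ** 2)         # exact count versus the bound
```

Setting `delta` to zero recovers the exact type class, matching the remark that closes the definition.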
5.3 Theorems and lemmas

Next we state the channel coding theorem for memoryless quantum channels. We
obtain the theorem by combining the direct part, given by Theorem 5.2, and the
strong converse, given by Theorem 5.3. The theorem is stated below.

Theorem 5.1. (Coding theorem for memoryless quantum channels)
For every $\lambda \in (0,1)$ there exists a constant $K(\lambda,a,d)$ such that for all memoryless
quantum channels $\Phi$,

\[ |\log N(n,\lambda) - n\,\chi^*(\Phi)| \le K(\lambda,a,d)\sqrt{n}. \tag{5.3.1} \]

The direct part of the coding theorem for memoryless quantum channels is given 
by the following theorem. It was proved by Holevo [1J and Schumacher and West- 
moreland GO, who built of the ideas of Hausladen et al. Il33l . 



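Theorem 5.2. (Code construction)

Given $\epsilon > 0$, there exists $n_0 \in \mathbb{N}$ such that for all $n \ge n_0$ there exists $N(n, \epsilon) =
N_n \ge 2^{n(\chi^*(\Phi) - \epsilon)}$, and there exist product states $\rho_1^{(n)}, \dots, \rho_{N_n}^{(n)} \in \mathcal{B}(\mathcal{H}^{\otimes n})$ and positive
operators $E_1^{(n)}, \dots, E_{N_n}^{(n)} \in \mathcal{B}(\mathcal{K}^{\otimes n})$, such that $\sum_{m=1}^{N_n} E_m^{(n)} \le I_n$ and

\[ \mathrm{Tr}\left( \Phi^{(n)}(\rho_m^{(n)}) E_m^{(n)} \right) \ge 1 - \epsilon, \tag{5.3.2} \]

for each $m$.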

The following lemmas are required in order to prove the strong converse theorem
for a memoryless quantum channel (Theorem 5.3). To make the proof more clear,
we provide a "map" to the proof in Figure 5.2.

Figure 5.2: Map of the proof of Theorem 5.3 (nodes: Theorem 5.3 and Lemmas 5.3.1 to 5.3.7).



We start with some definitions. We first define the trace norm for an operator $A$ as
follows,

\[ \|A\|_1 = \mathrm{Tr}\sqrt{A^* A}. \tag{5.3.3} \]

If $A$ is Hermitian, i.e. if $A = A^*$, then

\[ \|A\|_1 = \mathrm{Tr}|A| \tag{5.3.4} \]

and the trace norm of $A$ can therefore be written in terms of the
projection operators $\Pi^+$ and $\Pi^-$, where $\Pi^\pm$ are the projections onto the eigenspaces
of $A$ corresponding to all non-negative and all negative eigenvalues of $A$ respectively, i.e.

\[ \|A\|_1 = \mathrm{Tr}\left( \Pi^+ A - \Pi^- A \right). \tag{5.3.5} \]

More precisely, if $A$ has spectral decomposition

\[ A = \sum_{i=1}^{d} \lambda_i |u_i\rangle\langle u_i|, \tag{5.3.6} \]

and therefore

\[ |A| = \sum_{i=1}^{d} |\lambda_i|\, |u_i\rangle\langle u_i|, \tag{5.3.7} \]

then the projections $\Pi^\pm$ can be expressed as follows

\[ \Pi^+ = \sum_{\lambda_i \ge 0} |u_i\rangle\langle u_i|, \qquad \Pi^- = \sum_{\lambda_i < 0} |u_i\rangle\langle u_i|. \tag{5.3.8} \]

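Next, for a state $\rho$ we choose a diagonalisation

\[ \rho = \sum_j R(j)\, \pi_j. \tag{5.3.9} \]

Clearly the list of eigenvalues $R(j)$ forms a probability distribution and therefore
$S(\rho) = H(R)$, where $S(\cdot)$ represents the von Neumann entropy of a state and $H(\cdot)$
represents the Shannon entropy of a probability distribution.

We can now define the variance-typical projector of the state $\rho$ with constant $\delta \ge 0$
as follows,

\[ \Pi^n_{\rho,\delta} = \sum_{j^n \in \mathcal{T}^n_{R,\delta}} \pi_{j_1} \otimes \cdots \otimes \pi_{j_n}. \tag{5.3.10} \]

An operator $B$, $0 \le B \le 1$, is said to be an $\eta$-shadow of the state $\rho$ if

\[ \mathrm{Tr}(\rho B) \ge \eta. \tag{5.3.11} \]

The following lemma provides bounds on an operator $\Lambda$ with the constraint $0 \le
\Lambda \le 1$. These bounds will be used to prove subsequent lemmas.

Lemma 5.3.1. Let $0 \le \Lambda \le 1$ and $\rho$ a state commuting with $\Lambda$ such that, for some
$\lambda, \mu_1, \mu_2 > 0$, the following relations hold

\[ \mathrm{Tr}(\rho\Lambda) \ge 1 - \lambda \quad \text{and} \quad \mu_1 \Lambda \le \sqrt{\Lambda}\,\rho\,\sqrt{\Lambda} \le \mu_2 \Lambda. \tag{5.3.12} \]

Then we obtain the following bounds

\[ (1 - \lambda)\mu_2^{-1} \le \mathrm{Tr}(\Lambda) \le \mu_1^{-1}, \tag{5.3.13} \]

and for an $\eta$-shadow $B$ of $\rho$,

\[ \mathrm{Tr}(B) \ge (\eta - \lambda)\mu_2^{-1}. \tag{5.3.14} \]

Proof. We first show that $(1 - \lambda)\mu_2^{-1} \le \mathrm{Tr}(\Lambda) \le \mu_1^{-1}$. Using the inequalities
$\mathrm{Tr}(\rho\Lambda) \ge 1 - \lambda$ and $\sqrt{\Lambda}\,\rho\,\sqrt{\Lambda} \le \mu_2\Lambda$,

\[ \mathrm{Tr}(\Lambda) \ge \mu_2^{-1}(1 - \lambda). \tag{5.3.15} \]

Next, using $\mathrm{Tr}(\rho\Lambda) \le 1$ and $\mathrm{Tr}(\rho\Lambda) \ge \mu_1 \mathrm{Tr}(\Lambda)$,

\[ \mathrm{Tr}(\Lambda) \le \mu_1^{-1}. \tag{5.3.16} \]

Finally, we show Equation (5.3.14) as follows

\[
\begin{aligned}
\mu_2 \mathrm{Tr}(B) &\ge \mathrm{Tr}(\mu_2 \Lambda B) \\
&\ge \mathrm{Tr}(\sqrt{\Lambda}\,\rho\,\sqrt{\Lambda}\, B) \\
&= \mathrm{Tr}(\rho B) - \mathrm{Tr}((\rho - \sqrt{\Lambda}\,\rho\,\sqrt{\Lambda}) B) \\
&\ge \eta - \|\rho - \sqrt{\Lambda}\,\rho\,\sqrt{\Lambda}\|_1 \\
&= \eta - (\mathrm{Tr}(\rho) - \mathrm{Tr}(\rho\Lambda)) \\
&\ge \eta - \lambda.
\end{aligned}
\tag{5.3.17}
\]

The first inequality above is due to $\Lambda \le 1$, the second one is due to $\mu_2\Lambda \ge \sqrt{\Lambda}\,\rho\,\sqrt{\Lambda}$.
The next inequality holds since $\mathrm{Tr}(\rho B) \ge \eta$ and $0 \le B \le 1$, and the final one is by
$\mathrm{Tr}(\rho\Lambda) \ge 1 - \lambda$. □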

Next define $K = 2\frac{\log e}{e}$. When calculating $P_{x^n}(x)$ (Equation (5.2.2)) for a particular
symbol $x$ we are interested in whether or not the symbol $x$ appears at each position of the
sequence $x^n$. We therefore define Bernoulli random variables $X_i$ taking the value 1
if and only if $x_i = x$, with probability $P(x)$.
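The role of the constant $K$ is to bound terms of the form $-2\sqrt{r}\log\sqrt{r}$ for $r \in [0,1]$. A quick numerical sanity check can be sketched as follows, assuming that $K = 2(\log e)/e$ with logarithms taken to base 2 (an assumption made here to match the powers of 2 used in the bounds); the grid resolution is arbitrary.

```python
from math import e, log2, sqrt

# Assumed value of the constant K, with base-2 logarithms.
K = 2 * log2(e) / e

def term(r):
    """-2 sqrt(r) log2 sqrt(r), with the convention 0 log 0 = 0."""
    return 0.0 if r == 0 else -2 * sqrt(r) * log2(sqrt(r))

# Evaluate the term on a fine grid of probabilities in [0, 1].
grid = [i / 10000 for i in range(10001)]
worst = max(term(r) for r in grid)
print(K, worst)
```

The maximum of $-x\log x$ on $[0,1]$ is attained at $x = 1/e$, which is why the grid maximum approaches but never exceeds $K$.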

Lemma 5.3.2. For every state $\rho$ and positive integer $n$, the following three inequalities
hold.

Firstly, the probability that the state $\rho^{\otimes n}$ is typical, with respect to the set of
variance-typical sequences $\mathcal{T}^n_{R,\delta}$, is bounded below as follows

\[ \mathrm{Tr}(\rho^{\otimes n} \Pi^n_{\rho,\delta}) \ge 1 - \frac{d}{\delta^2}. \tag{5.3.18} \]

Secondly, the following relation holds,

\[ \Pi^n_{\rho,\delta}\, \rho^{\otimes n}\, \Pi^n_{\rho,\delta} \ge \Pi^n_{\rho,\delta}\, 2^{-nS(\rho) - Kd\delta\sqrt{n}}. \tag{5.3.19} \]

The above bound will be used directly to prove both Lemma 5.3.3 and the strong
converse theorem.

Finally, the size of the projection $\Pi^n_{\rho,\delta}$ is upper bounded as follows,

\[ \mathrm{Tr}(\Pi^n_{\rho,\delta}) \le 2^{nS(\rho) + Kd\delta\sqrt{n}}. \tag{5.3.20} \]

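Proof. Observe that

\[ \mathrm{Tr}(\rho^{\otimes n} \Pi^n_{\rho,\delta}) = R^{\otimes n}(\mathcal{T}^n_{R,\delta}). \tag{5.3.21} \]

Chebyshev's inequality states that

\[ P(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}, \tag{5.3.22} \]

where $X$ is a random variable, $\mu$ is the associated mean, $k > 0$ and $\sigma^2$ is the variance
of $X$ with respect to $\mu$.

The set of variance-typical sequences $\mathcal{T}^n_{R,\delta}$ (Equation (5.2.3)) is the intersection of
$d = \dim(\mathcal{H})$ events. For each $j$ (appearing in the sum in Equation (5.3.9)), the
random variable $X = \frac{N(j|j^n)}{n} = \frac{1}{n}\sum_{i=1}^{n} \delta_{j,j_i}$ must deviate from its expectation
$R(j)$ by at most $\frac{\delta\sqrt{R(j)(1 - R(j))}}{\sqrt{n}}$, by definition of $\mathcal{T}^n_{R,\delta}$.

Using Chebyshev's inequality,

\[ P\left( \left| \frac{N(j|j^n)}{n} - R(j) \right| > \frac{\delta\sqrt{R(j)(1 - R(j))}}{\sqrt{n}} \right) \le \frac{1}{\delta^2}, \tag{5.3.23} \]

for a given $j$. Therefore, the probability of the union of $d$ such events is less than or equal to $\frac{d}{\delta^2}$. But
$\mathcal{T}^n_{R,\delta}$ is the intersection of the complementary events, therefore

\[ R^{\otimes n}(\mathcal{T}^n_{R,\delta}) \ge 1 - \frac{d}{\delta^2}. \tag{5.3.24} \]

Let $\pi^n = \pi_{j_1} \otimes \cdots \otimes \pi_{j_n}$ be one of the eigenprojections of the tensor product state $\rho^{\otimes n}$
constituting $\Pi^n_{\rho,\delta}$. Then,

\[ \mathrm{Tr}(\rho^{\otimes n}\pi^n) = R(j_1) \cdots R(j_n) = \prod_{j=1}^{d} R(j)^{N(j|j^n)}, \tag{5.3.25} \]

and given $|N(j|j^n) - nR(j)| \le \delta\sqrt{n}\sqrt{R(j)(1 - R(j))}$,

\[
\begin{aligned}
\left| -\log \mathrm{Tr}(\rho^{\otimes n}\pi^n) - nS(\rho) \right|
&= \Big| \sum_{j=1}^{d} -N(j|j^n)\log R(j) - nS(\rho) \Big| \\
&= \Big| \sum_{j=1}^{d} \big( -N(j|j^n)\log R(j) + nR(j)\log R(j) \big) \Big| \\
&\le \sum_{j=1}^{d} |\log R(j)|\, |N(j|j^n) - nR(j)| \\
&\le \sum_{j=1}^{d} -\delta\sqrt{n}\sqrt{R(j)}\log R(j) \\
&= -2\delta\sqrt{n} \sum_{j=1}^{d} \sqrt{R(j)}\log\sqrt{R(j)} \\
&\le Kd\delta\sqrt{n},
\end{aligned}
\tag{5.3.26}
\]

since $-2\sqrt{R(j)}\log\sqrt{R(j)} \le K$. Therefore

\[ 2^{-nS(\rho) - Kd\delta\sqrt{n}} \le \mathrm{Tr}(\rho^{\otimes n}\pi^n) \le 2^{-nS(\rho) + Kd\delta\sqrt{n}}. \tag{5.3.27} \]

We have $\Pi^n_{\rho,\delta} = \sum_{j^n \in \mathcal{T}^n_{R,\delta}} \pi_{j_1} \otimes \cdots \otimes \pi_{j_n}$ and $\rho = \sum_j R(j)\pi_j$, therefore using
Equation (5.3.25) and the lower bound on $\mathrm{Tr}(\rho^{\otimes n}\pi^n)$, i.e. $\mathrm{Tr}(\rho^{\otimes n}\pi^n) \ge 2^{-nS(\rho) - Kd\delta\sqrt{n}}$,
we obtain the following,

\[
\begin{aligned}
\Pi^n_{\rho,\delta}\, \rho^{\otimes n}\, \Pi^n_{\rho,\delta}
&= \sum_{j^n \in \mathcal{T}^n_{R,\delta}} (\pi_{j_1} \otimes \cdots \otimes \pi_{j_n})\; \rho^{\otimes n} \sum_{j^n \in \mathcal{T}^n_{R,\delta}} (\pi_{j_1} \otimes \cdots \otimes \pi_{j_n}) \\
&= \sum_{j^n \in \mathcal{T}^n_{R,\delta}} \mathrm{Tr}(\rho^{\otimes n}\pi^n)\, (\pi_{j_1} \otimes \cdots \otimes \pi_{j_n}) \\
&\ge \Pi^n_{\rho,\delta}\, 2^{-nS(\rho) - Kd\delta\sqrt{n}}.
\end{aligned}
\tag{5.3.28}
\]

Therefore by Lemma 5.3.1 (taking $\mu_1 = 2^{-nS(\rho) - Kd\delta\sqrt{n}}$),

\[ \mathrm{Tr}(\Pi^n_{\rho,\delta}) \le 2^{nS(\rho) + Kd\delta\sqrt{n}}. \tag{5.3.29} \]

□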
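The Chebyshev step behind (5.3.18) can be checked numerically in the simplest case. The sketch below assumes a qubit state ($d = 2$) with eigenvalues $(r, 1 - r)$, so that the typical-set probability $R^{\otimes n}(\mathcal{T}^n_{R,\delta})$ reduces to a binomial sum; the function name and the parameter values are illustrative only.

```python
from math import comb, sqrt

def typical_probability(r, n, delta):
    """R^{(x)n}(T^n_{R,delta}) for a qubit state with eigenvalues (r, 1 - r).

    For d = 2 the two conditions in the definition of the typical set coincide,
    because |N(2|j^n) - n(1 - r)| = |N(1|j^n) - n r| and the variances agree.
    """
    width = delta * sqrt(n) * sqrt(r * (1 - r))
    total = 0.0
    for k in range(n + 1):  # k = N(1|j^n), the number of occurrences of eigenvalue r
        if abs(k - n * r) <= width:
            total += comb(n, k) * r**k * (1 - r) ** (n - k)
    return total

n, r, delta = 100, 0.3, 2.0
prob = typical_probability(r, n, delta)
bound = 1 - 2 / delta**2  # the lower bound 1 - d / delta^2 with d = 2
print(prob, bound)
```

The Chebyshev bound is typically far from tight, which is harmless here since only the scaling in $\delta$ is used later.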



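We now fix diagonalisations $\Phi_x = \sum_{j=1}^{d} Q(j|x)\, \pi_{x,j}$, where $Q(\cdot|\cdot)$ is a
stochastic matrix, and define the conditional variance-typical projector of $\Phi$
given $x^n$ with constant $\delta$ to be

\[ \Pi^n_{\Phi,\delta}(x^n) = \bigotimes_{x \in \mathcal{X}} \Pi^{I_x}_{\Phi_x,\delta}, \tag{5.3.30} \]

where $I_x = \{ i \in \{1, \dots, n\} : x_i = x \}$. With $\Phi^{(n)}_{x^n} = \Phi_{x_1} \otimes \Phi_{x_2} \otimes \cdots \otimes \Phi_{x_n}$, we
then have the following lemma.

Lemma 5.3.3. For all $x^n \in \mathcal{X}^n$ of type $P$, the probability that the output state
$\Phi^{(n)}_{x^n}$ is typical with respect to the conditional variance-typical sequences $\mathcal{T}^n_{Q,\delta}(x^n)$ is
bounded below as follows

\[ \mathrm{Tr}(\Phi^{(n)}_{x^n}\, \Pi^n_{\Phi,\delta}(x^n)) \ge 1 - \frac{ad}{\delta^2}, \tag{5.3.31} \]

and with $\Pi^n = \Pi^n_{\Phi,\delta}(x^n)$ the following inequality holds, and will subsequently be
used to prove Lemma 5.3.4 and the strong converse theorem,

\[ \Pi^n\, \Phi^{(n)}_{x^n}\, \Pi^n \le \Pi^n\, 2^{-nS(\Phi|P) + Kd\sqrt{a}\,\delta\sqrt{n}}. \tag{5.3.32} \]

Every $\eta$-shadow $B$ of $\Phi^{(n)}_{x^n}$ satisfies

\[ \mathrm{Tr}(B) \ge \left( \eta - \frac{ad}{\delta^2} \right) 2^{nS(\Phi|P) - Kd\sqrt{a}\,\delta\sqrt{n}}. \tag{5.3.33} \]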

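Proof. The first inequality (5.3.31) is obtained by applying Lemma 5.3.2 $a$ times.
The inequality given by expression (5.3.33) follows from the inequalities (5.3.31)
and (5.3.32), using Lemma 5.3.1. Next we prove the inequality (5.3.32). Using
$\Pi^n = \bigotimes_{x \in \mathcal{X}} \Pi^{I_x}_{\Phi_x,\delta}$, we have

\[
\begin{aligned}
\Pi^n\, \Phi^{(n)}_{x^n}\, \Pi^n
&= \bigotimes_{x \in \mathcal{X}} \Pi^{I_x}_{\Phi_x,\delta}\, \Phi_x^{\otimes I_x}\, \Pi^{I_x}_{\Phi_x,\delta} \\
&= \bigotimes_{x \in \mathcal{X}} \sum_{j^{I_x} \in \mathcal{T}^{I_x}_{Q_x,\delta}} (\pi_{j_1} \otimes \cdots \otimes \pi_{j_{|I_x|}})\, \Phi_x^{\otimes I_x} \sum_{j^{I_x} \in \mathcal{T}^{I_x}_{Q_x,\delta}} (\pi_{j_1} \otimes \cdots \otimes \pi_{j_{|I_x|}}) \\
&= \bigotimes_{x \in \mathcal{X}} \sum_{j^{I_x} \in \mathcal{T}^{I_x}_{Q_x,\delta}} Q(j_1|x) \cdots Q(j_{|I_x|}|x)\, (\pi_{j_1} \otimes \cdots \otimes \pi_{j_{|I_x|}}) \\
&= \bigotimes_{x \in \mathcal{X}} \sum_{j^{I_x} \in \mathcal{T}^{I_x}_{Q_x,\delta}} \prod_{j=1}^{d} Q(j|x)^{N(j|j^{I_x})}\, \pi^{I_x},
\end{aligned}
\tag{5.3.34}
\]

but, since $j^{I_x} \in \mathcal{T}^{I_x}_{Q_x,\delta}$ and $Q(j|x) \le 1$,

\[ N(j|j^{I_x}) \ge |I_x| Q(j|x) - \delta\sqrt{|I_x|}\sqrt{Q(j|x)}, \tag{5.3.35} \]

and therefore

\[ \Pi^n\, \Phi^{(n)}_{x^n}\, \Pi^n \le \bigotimes_{x \in \mathcal{X}} \sum_{j^{I_x} \in \mathcal{T}^{I_x}_{Q_x,\delta}} \prod_{j=1}^{d} Q(j|x)^{\left( |I_x| Q(j|x) - \delta\sqrt{|I_x|}\sqrt{Q(j|x)} \right)}\, \pi^{I_x}. \tag{5.3.36} \]

Recall that we are assuming $x^n$ to be of type $P$, therefore $|I_x| = nP(x)$ and

\[ \prod_{x \in \mathcal{X}} \prod_{j=1}^{d} Q(j|x)^{|I_x| Q(j|x)} = \prod_{x \in \mathcal{X}} \prod_{j=1}^{d} 2^{nP(x) Q(j|x) \log Q(j|x)} = \prod_{x \in \mathcal{X}} 2^{-nP(x) S(\Phi_x)}, \tag{5.3.37} \]

and

\[ \prod_{j=1}^{d} 2^{-2\delta\sqrt{|I_x|}\sqrt{Q(j|x)}\log\sqrt{Q(j|x)}} \le 2^{Kd\delta\sqrt{|I_x|}}. \tag{5.3.38} \]

Using the Cauchy-Schwarz inequality

\[ |\langle a | b \rangle| \le \|a\|_2 \cdot \|b\|_2, \tag{5.3.39} \]

i.e. $\left( \sum_k a_k b_k \right)^2 \le \left( \sum_k a_k^2 \right)\left( \sum_k b_k^2 \right)$, it is clear that $\sum_{x \in \mathcal{X}} \sqrt{|I_x|} \le \sqrt{an}$ and therefore,

\[
\begin{aligned}
\Pi^n\, \Phi^{(n)}_{x^n}\, \Pi^n
&\le \bigotimes_{x \in \mathcal{X}} \sum_{j^{I_x} \in \mathcal{T}^{I_x}_{Q_x,\delta}} 2^{-nP(x) S(\Phi_x) + Kd\delta\sqrt{|I_x|}}\, \pi^{I_x} \\
&\le \Pi^n\, 2^{-nS(\Phi|P) + Kd\delta\sqrt{a}\sqrt{n}},
\end{aligned}
\tag{5.3.40}
\]

as required.

Now we can use Lemma 5.3.1 taking $\Lambda = \Pi^n_{\Phi,\delta}(x^n)$ and $\rho = \Phi^{(n)}_{x^n}$. We can therefore
conclude, using $\mathrm{Tr}(B) \ge (\eta - \lambda)\mu_2^{-1}$ from Lemma 5.3.1, that

\[ \mathrm{Tr}(B) \ge \left( \eta - \frac{ad}{\delta^2} \right) 2^{nS(\Phi|P) - Kd\sqrt{a}\,\delta\sqrt{n}}. \tag{5.3.41} \]

□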

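Lemma 5.3.4. Let $x^n$ be of type $P$. Then the probability that the output state $\Phi^{(n)}_{x^n}$
is typical with respect to the variance-typical projection $\Pi^n_{P\sigma,\delta\sqrt{a}}$ is lower bounded
as follows,

\[ \mathrm{Tr}\left( \Phi^{(n)}_{x^n}\, \Pi^n_{P\sigma,\delta\sqrt{a}} \right) \ge 1 - \frac{ad}{\delta^2}. \tag{5.3.42} \]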

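Proof. First diagonalise

\[ P\sigma = \sum_{j=1}^{d} q_j \pi_j \tag{5.3.43} \]

and define the CPT map $\Psi : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{K})$ by

\[ \Psi(A) = \sum_{j=1}^{d} \pi_j A \pi_j. \tag{5.3.44} \]

We now show the following,

\[ \Pi^n_{P\sigma,\delta\sqrt{a}} \ge \Pi^n_{\Psi\Phi,\delta}(x^n). \tag{5.3.45} \]

Let $\pi^n = \pi_{j_1} \otimes \cdots \otimes \pi_{j_n}$ be one of the product states comprising $\Pi^n_{\Psi\Phi,\delta}(x^n)$. We have

\[ \Psi\Phi_x = \sum_{j=1}^{d} q_{j|x}\, \pi_j \tag{5.3.46} \]

and consequently

\[ \Pi^n_{\Psi\Phi,\delta}(x^n) = \bigotimes_{x \in \mathcal{X}} \Pi^{I_x}_{\Psi\Phi_x,\delta} \tag{5.3.47} \]

and

\[ \left| N(j|j^{I_x}) - |I_x|\, q_{j|x} \right| \le \delta\sqrt{|I_x|}\sqrt{q_{j|x}(1 - q_{j|x})}. \tag{5.3.48} \]

Since $x^n$ is of type $P$ we have $|I_x| = nP(x)$ and $q_j = \sum_{x \in \mathcal{X}} P(x) q_{j|x}$, then

\[
\begin{aligned}
|N(j|j^n) - nq_j| &\le \sum_{x \in \mathcal{X}} \left| N(j|j^{I_x}) - |I_x|\, q_{j|x} \right| \\
&\le \sum_{x \in \mathcal{X}} \delta\sqrt{n}\sqrt{P(x)}\sqrt{q_{j|x}(1 - q_{j|x})} \\
&\le \delta\sqrt{n}\sqrt{a} \sqrt{\sum_{x \in \mathcal{X}} P(x) q_{j|x}(1 - q_{j|x})} \\
&\le \delta\sqrt{n}\sqrt{a}\sqrt{q_j(1 - q_j)},
\end{aligned}
\tag{5.3.49}
\]

using the Cauchy-Schwarz inequality, concavity of $x \mapsto x(1 - x)$ and $q_j = \sum_{x \in \mathcal{X}} P(x) q_{j|x}$.

We can conclude that $\pi^n = \pi_{j_1} \otimes \cdots \otimes \pi_{j_n}$ contributes to $\Pi^n_{P\sigma,\delta\sqrt{a}}$ and therefore
$\Pi^n_{P\sigma,\delta\sqrt{a}} \ge \Pi^n_{\Psi\Phi,\delta}(x^n)$, as required.

Using the definition of the CPT map $\Psi$ given by Equation (5.3.44) and the trace
preserving property of quantum channels, we obtain the following

\[
\begin{aligned}
\mathrm{Tr}\left( \Phi^{(n)}_{x^n}\, \Pi^n_{P\sigma,\delta\sqrt{a}} \right)
&= \mathrm{Tr}\left( \Psi^{\otimes n}(\Phi^{(n)}_{x^n})\, \Pi^n_{P\sigma,\delta\sqrt{a}} \right) \\
&\ge \mathrm{Tr}\left( \Psi^{\otimes n}(\Phi^{(n)}_{x^n})\, \Pi^n_{\Psi\Phi,\delta}(x^n) \right) \\
&\ge 1 - \frac{ad}{\delta^2},
\end{aligned}
\tag{5.3.50}
\]

where the first inequality is by Equation (5.3.45) and the final inequality is by Lemma
5.3.3. □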

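Next we introduce two fidelity lemmas. In the following $\rho$ is taken to be a pure state
and $\sigma$ may be a mixed state. The trace norm distance $D(\rho, \sigma)$ is defined for $\rho$ and $\sigma$
as follows

\[ D(\rho, \sigma) = \frac{1}{2}\|\rho - \sigma\|_1 \tag{5.3.51} \]

and the pure state fidelity

\[ F(\rho, \sigma) = \mathrm{Tr}(\rho\sigma). \tag{5.3.52} \]

Lemma 5.3.5. Let $\rho = |\psi\rangle\langle\psi|$ and $\sigma = |\phi\rangle\langle\phi|$ be pure states. Then

\[ 1 - F(\rho, \sigma) = D(\rho, \sigma)^2. \tag{5.3.53} \]

Proof. Take $|\psi\rangle = |0\rangle$ and $|\phi\rangle = \alpha|0\rangle + \beta|1\rangle$, where $|\alpha|^2 + |\beta|^2 = 1$. Therefore

\[ \rho = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad \sigma = \begin{pmatrix} |\alpha|^2 & \alpha\bar{\beta} \\ \bar{\alpha}\beta & |\beta|^2 \end{pmatrix}. \tag{5.3.54} \]

The fidelity $F(\rho, \sigma)$ can now be calculated,

\[ F(\rho, \sigma) = \mathrm{Tr}(\rho\sigma) = |\alpha|^2. \tag{5.3.55} \]

The trace distance is $D(\rho, \sigma) = \frac{1}{2}\|\rho - \sigma\|_1 = \frac{1}{2}\mathrm{Tr}\sqrt{(\rho - \sigma)^*(\rho - \sigma)}$ and therefore,
using $|\alpha|^2 + |\beta|^2 = 1$,

\[ D(\rho, \sigma) = |\beta|, \tag{5.3.56} \]

and therefore

\[ 1 - F(\rho, \sigma) = D(\rho, \sigma)^2. \tag{5.3.57} \]

□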
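The identity of Lemma 5.3.5 can be checked numerically for qubit states. The sketch below is an illustration only: the helper names and the amplitudes are assumptions, and the trace norm is computed from the closed-form eigenvalues of a 2x2 Hermitian matrix. The last lines also evaluate a mixed $\sigma$, for which only the inequality $D(\rho,\sigma)^2 \le 1 - F(\rho,\sigma)$ is expected.

```python
from math import sqrt

def trace_norm_2x2(m):
    """Trace norm of a 2x2 Hermitian matrix: sum of absolute values of its eigenvalues."""
    a, b, c = m[0][0].real, m[0][1], m[1][1].real
    mean = (a + c) / 2
    radius = sqrt(((a - c) / 2) ** 2 + abs(b) ** 2)
    return abs(mean + radius) + abs(mean - radius)

def projector(alpha, beta):
    """Density matrix |phi><phi| for |phi> = alpha|0> + beta|1>."""
    return [[alpha * alpha.conjugate(), alpha * beta.conjugate()],
            [beta * alpha.conjugate(), beta * beta.conjugate()]]

def trace_distance(rho, sigma):
    """D(rho, sigma) = (1/2) ||rho - sigma||_1."""
    diff = [[rho[i][j] - sigma[i][j] for j in range(2)] for i in range(2)]
    return 0.5 * trace_norm_2x2(diff)

def fidelity(rho, sigma):
    """F(rho, sigma) = Tr(rho sigma), used here with rho pure."""
    return sum(rho[i][j] * sigma[j][i] for i in range(2) for j in range(2)).real

rho = projector(1, 0)                      # |0><0|
sigma = projector(0.6, 0.8j)               # hypothetical amplitudes, |0.6|^2 + |0.8|^2 = 1
print(1 - fidelity(rho, sigma), trace_distance(rho, sigma) ** 2)

# Equal mixture of two pure states: only the inequality is expected to hold.
other = projector(1 / sqrt(2), 1 / sqrt(2))
mixed = [[0.5 * sigma[i][j] + 0.5 * other[i][j] for j in range(2)] for i in range(2)]
print(trace_distance(rho, mixed) ** 2, 1 - fidelity(rho, mixed))
```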



In the following lemma we relax the assumption that both states $\sigma$ and $\rho$ must be
pure states.

Lemma 5.3.6. Let $\sigma$ be an arbitrary mixed state, and $\rho$ a pure state. Then

\[ D(\rho, \sigma)^2 \le 1 - F(\rho, \sigma). \tag{5.3.58} \]

Proof. Write $\sigma = \sum_j q_j \pi_j$, where $\pi_j$ are pure states. Then, using Lemma 5.3.5,

\[
\begin{aligned}
1 - F(\rho, \sigma) &= \sum_j q_j \left( 1 - F(\rho, \pi_j) \right) \\
&= \sum_j q_j D(\rho, \pi_j)^2 \\
&\ge \Big( \sum_j q_j D(\rho, \pi_j) \Big)^2 \\
&\ge D(\rho, \sigma)^2.
\end{aligned}
\tag{5.3.59}
\]

The first inequality above is due to the convexity of $f(x) = x^2$. The second inequality
is due to the convexity of $D(\rho, \sigma)$ (triangle inequality). □

Informally, the following lemma states that, under the trace norm (defined by Equation
(5.3.3)), the state $\rho$ is disturbed by at most $\sqrt{8\lambda}$ by the operator $X$, provided
that $1 - \mathrm{Tr}(\rho X) \le \lambda \le 1$. In the proof of the strong converse theorem, the positive
operator $X$ will be replaced by the projector onto the typical subspace for
$P\sigma$, denoted $\Pi^n_{P\sigma,\delta\sqrt{a}}$, and the state $\rho$ will be replaced by the output state $\Phi^{(n)}_{x^n}$. In
this case the lemma takes on the following important interpretation. If the
probability that the output state is not typical (with respect to the typical subspace
of $P\sigma$) is at most $\lambda$, i.e. if $1 - \mathrm{Tr}(\Phi^{(n)}_{x^n} \Pi^n_{P\sigma,\delta\sqrt{a}}) \le \lambda$, then under the trace norm, the
state $\Phi^{(n)}_{x^n}$ is disturbed by at most $\sqrt{8\lambda}$ when projected onto the typical subspace
for the average output state $P\sigma$.

Lemma 5.3.7. Let $\rho$ be a state and $X$ a positive operator with $X \le 1$ and
$1 - \mathrm{Tr}(\rho X) \le \lambda \le 1$. Then

\[ \|\rho - \sqrt{X}\,\rho\,\sqrt{X}\|_1 \le \sqrt{8\lambda}. \tag{5.3.60} \]

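Proof. Let $Y = \sqrt{X}$ and $\rho = \sum_k p_k \pi_k$, where $\pi_k$ are pure states and $p_k \ge 0$. We
then have the following

\[
\begin{aligned}
\|\rho - Y\rho Y\|_1^2 &\le \Big( \sum_k p_k \|\pi_k - Y\pi_k Y\|_1 \Big)^2 \\
&\le \sum_k p_k \|\pi_k - Y\pi_k Y\|_1^2 \\
&\le 4\sum_k p_k \left( 1 - \mathrm{Tr}(\pi_k Y \pi_k Y) \right) \\
&\le 8\sum_k p_k \left( 1 - \mathrm{Tr}(\pi_k Y) \right) \\
&= 8\left( 1 - \mathrm{Tr}(\rho Y) \right) \le 8\lambda.
\end{aligned}
\tag{5.3.61}
\]

The first inequality is by the triangle inequality,

\[ \|x + y\|_1 \le \|x\|_1 + \|y\|_1. \tag{5.3.62} \]

The second is due to the convexity of $x \mapsto x^2$ and the third is due to Lemma 5.3.6.
The next inequality is shown as follows. Since $\pi_k$ is a projection, $\pi_k = (\pi_k)^2$, and

\[
\begin{aligned}
1 - \mathrm{Tr}(\pi_k Y \pi_k Y) &= \mathrm{Tr}(\pi_k - \pi_k Y \pi_k Y) \\
&= \mathrm{Tr}(\pi_k - \pi_k Y) + \mathrm{Tr}(\pi_k Y - \pi_k Y \pi_k Y) \\
&= 1 - \mathrm{Tr}(\pi_k Y) + \mathrm{Tr}(\pi_k Y \pi_k (\pi_k - \pi_k Y)).
\end{aligned}
\tag{5.3.63}
\]

But $\|Y\| \le 1$ and therefore $\mathrm{Tr}(\pi_k Y \pi_k (\pi_k - \pi_k Y)) \le \mathrm{Tr}(\pi_k - \pi_k Y)$ and we have

\[ 1 - \mathrm{Tr}(\pi_k Y \pi_k Y) \le 2\left( 1 - \mathrm{Tr}(\pi_k Y) \right). \tag{5.3.64} \]

The final inequality in (5.3.61) uses $Y \ge X$ and $1 - \mathrm{Tr}(\rho X) \le \lambda$. □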
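The bound of Lemma 5.3.7 can be checked numerically in a small case. The following sketch assumes a qubit, a pure state $\rho$ with hypothetical entries, and an operator $X = \mathrm{diag}(x_0, x_1)$ that is diagonal in the computational basis, so that $Y = \sqrt{X}$ is obtained entrywise; none of these choices come from the text.

```python
from math import sqrt

def trace_norm_2x2(m):
    """Trace norm of a 2x2 Hermitian matrix: sum of absolute values of its eigenvalues."""
    a, b, c = m[0][0].real, m[0][1], m[1][1].real
    mean = (a + c) / 2
    radius = sqrt(((a - c) / 2) ** 2 + abs(b) ** 2)
    return abs(mean + radius) + abs(mean - radius)

def disturbance(rho, x0, x1):
    """Return (||rho - Y rho Y||_1, lambda) for X = diag(x0, x1) and Y = sqrt(X).

    Here lambda = 1 - Tr(rho X), the smallest value allowed in the lemma.
    """
    y = (sqrt(x0), sqrt(x1))
    diff = [[rho[i][j] - y[i] * rho[i][j] * y[j] for j in range(2)] for i in range(2)]
    lam = 1 - (x0 * rho[0][0] + x1 * rho[1][1]).real
    return trace_norm_2x2(diff), lam

# Hypothetical pure state sqrt(0.9)|0> + sqrt(0.1)|1> written as a density matrix.
rho = [[0.9, 0.3], [0.3, 0.1]]

for x0 in (1.0, 0.8):
    for x1 in (0.0, 0.2, 0.5, 1.0):
        norm, lam = disturbance(rho, x0, x1)
        print(x0, x1, norm, sqrt(8 * max(lam, 0.0)))
```

Each printed row shows the actual disturbance next to the bound $\sqrt{8\lambda}$; the bound is loose, which is acceptable because only its dependence on $\lambda$ enters the strong converse.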

Next we state and prove the strong converse theorem for memoryless quantum channels
(Winter [4]).

5.3.1 Strong converse for a memoryless quantum channel

Theorem 5.3. (Strong converse for memoryless quantum channels)

For $\lambda \in (0, 1)$ there exists a constant $K(\lambda, a, d)$ such that for every memoryless quantum channel
$\Phi$ and $(n, \lambda)$-code

\[ |\mathcal{M}| \le 2^{n\chi^*(\Phi) + K(\lambda, a, d)\sqrt{n}}. \tag{5.3.65} \]

Proof. Note that if we assume all codewords to be of the same type $P$, then we
may tighten the above bound. We first show that the number of codewords is upper
bounded as follows

\[ |\mathcal{M}_P| \le \frac{4}{1 - \lambda}\, 2^{nI(P;\Phi) + 2Kd\sqrt{a}\,\delta\sqrt{n}}, \tag{5.3.66} \]

taking $\delta = \frac{\sqrt{32ad}}{1 - \lambda}$. The theorem then follows since there are at most $(n+1)^a$ possible
types, where $a$ is the size of the alphabet $\mathcal{X}$, i.e. $a = |\mathcal{X}|$. We will demonstrate
this once we have proved expression (5.3.66).

First take an $(n, \lambda)$-code with the decoding operators $E^n_m$, i.e.

\[ \mathrm{Tr}\left( \Phi^{(n)}(\rho^{(n)}_m) E^n_m \right) \ge 1 - \lambda, \tag{5.3.67} \]

where $\Phi^{(n)}(\rho^{(n)}_m)$ is the output from a memoryless quantum channel acting on the
input state $\rho^{(n)}_m$ and $\mathrm{Tr}(\Phi^{(n)}(\rho^{(n)}_m) E^n_m)$ is the probability of successful decoding.

To prove inequality (5.3.65) we now construct the following new decoding operators

\[ E'^n_m = \Pi^n_{P\sigma,\delta\sqrt{a}}\, E^n_m\, \Pi^n_{P\sigma,\delta\sqrt{a}}, \tag{5.3.68} \]

where $\Pi^n_{P\sigma,\delta\sqrt{a}}$ is the projection onto the typical subspace for $P\sigma$, the average output
state of the channel $\Phi$.

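Let $C^n_m = C^n(m) = x^n$, for $m \in \mathcal{M}$. Then $(C^n, E'^n)$ is an $(n, \frac{1+\lambda}{2})$-code, since
for $m \in \{1, \dots, N_n\}$ the probability $\mathrm{Tr}(\Phi^{(n)}(\rho^{(n)}_m) E'^n_m)$ can be written as follows,

\[
\begin{aligned}
\mathrm{Tr}\left( \Phi^{(n)}(\rho^{(n)}_m) E'^n_m \right) &= \mathrm{Tr}\left( \Pi^n_{P\sigma,\delta\sqrt{a}}\, \Phi^{(n)}(\rho^{(n)}_m)\, \Pi^n_{P\sigma,\delta\sqrt{a}}\, E^n_m \right) \\
&\ge 1 - \lambda - \left\| \Phi^{(n)}(\rho^{(n)}_m) - \Pi^n_{P\sigma,\delta\sqrt{a}}\, \Phi^{(n)}(\rho^{(n)}_m)\, \Pi^n_{P\sigma,\delta\sqrt{a}} \right\|_1 \\
&\ge 1 - \lambda - \sqrt{\frac{8ad}{\delta^2}} \\
&= \frac{1 - \lambda}{2}.
\end{aligned}
\tag{5.3.69}
\]

The first inequality holds since $(C^n, E^n)$ is assumed to be an $(n, \lambda)$-code and since
$E^n_m \le 1$. The second inequality uses Lemma 5.3.4 and Lemma 5.3.7 as follows.
Lemma 5.3.7 states that, for $1 - \mathrm{Tr}(\rho X) \le \lambda' \le 1$, the following inequality holds

\[ \|\rho - \sqrt{X}\,\rho\,\sqrt{X}\|_1 \le \sqrt{8\lambda'}. \tag{5.3.70} \]

In our case, we put $X = \Pi^n_{P\sigma,\delta\sqrt{a}}$ and $\rho = \Phi^{(n)}(\rho^{(n)}_m)$, and by Lemma 5.3.4

\[ 1 - \mathrm{Tr}\left( \Phi^{(n)}(\rho^{(n)}_m)\, \Pi^n_{P\sigma,\delta\sqrt{a}} \right) \le \frac{ad}{\delta^2}. \tag{5.3.71} \]

The second inequality then follows from Lemma 5.3.7, taking $\lambda' = \frac{ad}{\delta^2}$. The final equality uses the choice $\delta = \frac{\sqrt{32ad}}{1 - \lambda}$. The code
$(C^n, E'^n)$ is therefore an $(n, \frac{1+\lambda}{2})$-code, since

\[ p_e(C^n, E'^n) = \max_m \left( 1 - \mathrm{Tr}\left( \Phi^{(n)}(\rho^{(n)}_m) E'^n_m \right) \right) \le \frac{1 + \lambda}{2}. \tag{5.3.72} \]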



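Since $\mathrm{Tr}(\Phi^{(n)}(\rho^{(n)}_m) E'^n_m) \ge \frac{1 - \lambda}{2}$, the operator $E'^n_m$ is an $\eta = \frac{1 - \lambda}{2}$ shadow of
$\Phi^{(n)}(\rho^{(n)}_m)$ (see Equation (5.3.11) for the definition of an $\eta$-shadow) and by Lemma 5.3.3

\[ \mathrm{Tr}(E'^n_m) \ge \frac{1 - \lambda}{4}\, 2^{nS(\Phi|P) - Kd\sqrt{a}\,\delta\sqrt{n}}. \tag{5.3.73} \]

The sequence $x^n$ is of type $P$ so we can use Lemma 5.3.2 as follows,

\[ \sum_{m \in \mathcal{M}} \mathrm{Tr}(E'^n_m) \le \mathrm{Tr}\left( \Pi^n_{P\sigma,\delta\sqrt{a}} \right) \le 2^{nS(P\sigma) + Kd\sqrt{a}\,\delta\sqrt{n}}. \tag{5.3.74} \]

Therefore,

\[ |\mathcal{M}| \min_{m} \mathrm{Tr}(E'^n_m) \le 2^{nS(P\sigma) + Kd\sqrt{a}\,\delta\sqrt{n}}, \tag{5.3.75} \]

and using Equation (5.3.73) we obtain an upper bound on the number of codewords,

\[ |\mathcal{M}| \le \frac{4}{1 - \lambda}\, 2^{n(S(P\sigma) - S(\Phi|P)) + 2Kd\sqrt{a}\,\delta\sqrt{n}}, \tag{5.3.76} \]

and using the definition of mutual information (Equation (5.1.4))

\[ |\mathcal{M}| \le \frac{4}{1 - \lambda}\, 2^{nI(P;\Phi) + 2Kd\sqrt{a}\,\delta\sqrt{n}}, \tag{5.3.77} \]

using $I(P;\Phi) = S(P\sigma) - S(\Phi|P)$. This proves Equation (5.3.66).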

Next we make use of the fact that the number of types is upper bounded by $(1 + n)^a$,
where $a$ is the size of the input alphabet. This result has the interpretation that the
number of types increases only polynomially with $n$.

Dropping the additional assumption that all the codewords are of the same type, we sum over the types and obtain

\[
\begin{aligned}
\sum_P |\mathcal{M}_P| &\le k \sum_P 2^{nI(P;\Phi) + 2Kd\sqrt{a}\,\delta\sqrt{n}} \\
&\le k(1 + n)^a \max_P 2^{nI(P;\Phi) + 2Kd\sqrt{a}\,\delta\sqrt{n}} \\
&\le k(1 + n)^a\, 2^{n\chi^*(\Phi) + 2Kd\sqrt{a}\,\delta\sqrt{n}},
\end{aligned}
\tag{5.3.78}
\]

where $k = \frac{4}{1 - \lambda}$. Clearly, a term of the form $2^{c\sqrt{n}}$ dominates the $k(1 + n)^a$ term for $n$ large, i.e. for a suitable constant $c$,

\[ k(1 + n)^a \le 2^{c\sqrt{n}}. \tag{5.3.79} \]

Therefore

\[ \sum_P |\mathcal{M}_P| \le 2^{n\chi^*(\Phi) + (2Kd\sqrt{a}\,\delta + c)\sqrt{n}}, \tag{5.3.80} \]

which is of the form (5.3.65).

□
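The arithmetic behind the choice $\delta = \sqrt{32ad}/(1-\lambda)$ in the proof above can be verified directly. The sketch below uses hypothetical values of $\lambda$, $a$ and $d$; the function name is an assumption. It checks that the modified code succeeds with probability $(1-\lambda)/2$ and that the shadow constant $\eta - ad/\delta^2$ is at least $(1-\lambda)/4$, which is where the prefactor $4/(1-\lambda)$ comes from.

```python
from math import sqrt

def strong_converse_constants(lam, a, d):
    """Return (delta, success, shadow_gap) for the choice delta = sqrt(32 a d) / (1 - lam)."""
    delta = sqrt(32 * a * d) / (1 - lam)
    # sqrt(8 lambda') with lambda' = a d / delta^2, the disturbance bound of Lemma 5.3.7.
    disturbance = sqrt(8 * a * d / delta**2)
    # Success probability of the modified decoder, 1 - lam - sqrt(8 a d / delta^2).
    success = 1 - lam - disturbance
    # eta - a d / delta^2 with eta = success, the constant in the shadow bound.
    shadow_gap = success - a * d / delta**2
    return delta, success, shadow_gap

# Hypothetical parameters: error 0.1, binary alphabet, qubit outputs.
lam, a, d = 0.1, 2, 2
delta, success, shadow_gap = strong_converse_constants(lam, a, d)
print(delta, success, shadow_gap, (1 - lam) / 4)
```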



5.4 Coding theorem for a periodic quantum channel 

Recall that a periodic channel acting on an $n$-fold density operator can be written in
the following form

\[ \Phi^{(n)}(\rho^{(n)}) = \frac{1}{L}\sum_{i=0}^{L-1} (\Phi_i \otimes \Phi_{i+1} \otimes \cdots \otimes \Phi_{i+n-1})(\rho^{(n)}), \tag{5.4.1} \]

where $\Phi_i$ are CPT maps and the index is cyclic modulo the period $L$. The Holevo
quantity for the $i$-th branch of the channel is denoted $\chi_i(\{p_j, \rho_j\})$. The product-state
capacity of the channel (5.4.1) is given by

\[ C_p(\Phi) = \frac{1}{L}\sup_{\{p_j,\rho_j\}} \sum_{i=0}^{L-1} \chi_i(\{p_j, \rho_j\}). \tag{5.4.2} \]

The proof for the direct part of this theorem is provided in Appendix B.

We show that the strong converse theorem does not hold for the above expression.
As a consequence it does not provide a sharp upper bound on the rate at which
classical information can be transmitted over the periodic channel.

Recall that the strong converse for a memoryless quantum channel $\Phi$ states that

\[ \log_2 |\mathcal{M}| \le n\chi^*(\Phi) + K\sqrt{n}. \tag{5.4.3} \]

The strong converse for the periodic quantum channel does not hold since the capacity
of the channel, $C_p$, is upper bounded as follows,

\[ C_p \le \bar{C}_p, \tag{5.4.4} \]

where

\[ \bar{C}_p = \frac{1}{L}\sum_{i=0}^{L-1} \sup_{\{p_j,\rho_j\}} \chi_i(\{p_j, \rho_j\}). \tag{5.4.5} \]

However, each branch of the periodic channel $\Phi^{(n)}$ can be written as a memoryless
channel of dimension $d' = d^L$ as follows,

\[ (\Phi_i \otimes \Phi_{i+1} \otimes \cdots \otimes \Phi_{i+L-1})^{\otimes n/L}. \tag{5.4.6} \]

Remark 7. Note that equality for expression (5.4.4) can be shown to hold for the
depolarising channel, and it is shown in Chapter 4 that a strict inequality holds for
the amplitude-damping channel.

Since we are limited to using product-state inputs, the product-state capacity of each
channel branch is additive and therefore equal, i.e.

\[
\begin{aligned}
\chi^*\left( (\Phi_0 \otimes \Phi_1 \otimes \cdots \otimes \Phi_{L-1})^{\otimes n/L} \right) &= \frac{n}{L}\left( \chi^*(\Phi_0) + \chi^*(\Phi_1) + \cdots + \chi^*(\Phi_{L-1}) \right) \\
&= \frac{n}{L}\sum_{j=0}^{L-1} \chi^*(\Phi_j).
\end{aligned}
\tag{5.4.7}
\]

Remark 8. Notice that, like Equation (5.4.7), $\bar{C}_p$ (Equation (5.4.5)) is the average
of the Holevo capacities of the $L$ channel branches. Therefore, if we knew in advance
which channel branch will be chosen, then we could take the rate $R = \bar{C}_p$
with the probability of error $p_e \le \epsilon$ and the strong converse would immediately follow,
using Theorem 5.3. However, we do not have this additional information. We
compromise by assuming to know the channel branch in advance and then compensate
for this assumption by taking $p_e \le \frac{1}{L} + \epsilon$. But the rate $R = \bar{C}_p$ could be too
high for certain channel branches (branches consisting of amplitude-damping
channels, for example). We therefore must choose a rate in between $C_p$
and $\bar{C}_p$ with $p_e \le \frac{1}{L} + \epsilon$.

We choose a rate in between the two expressions above and we demonstrate the
strong converse using this "compromised" rate. We refer to this result as the "weakened"
strong converse to the coding theorem for the periodic quantum channel.

More precisely, we choose a rate $R$ such that

\[ C_p \le R \le \bar{C}_p. \tag{5.4.8} \]

The direct part of the theorem, i.e. with rate below the upper bound, may be argued
as follows.

We choose a particular channel branch, say $i = 1$, and simply take the code
$(C^n, E^n)$ for this (memoryless) product channel,

\[ (\Phi_1 \otimes \Phi_2 \otimes \cdots \otimes \Phi_L)^{\otimes n}, \tag{5.4.9} \]

with rate $R < \bar{C}_p$. However, we must pay a penalty (thereby diluting the theorem
and reducing the theorem for the periodic channel to that of a single branch) for
fixing a particular branch and therefore must construct the code such that

\[ p_e \le \frac{1}{L} + \epsilon, \tag{5.4.10} \]

where $\frac{1}{L}$ is the probability of choosing a particular branch.

Using Theorem 5.3 we now argue that the strong converse holds for the rate given
by Equation (5.4.8). Let $(C^n, E^n)$ be an $(n, \lambda)$-code. Therefore

\[ p_e = \frac{1}{L}\sum_{i=0}^{L-1} p_e^i \le \lambda, \tag{5.4.11} \]

where $p_e$ denotes the average probability of error for the periodic channel and $p_e^i$
denotes the probability of error for the $i$-th channel branch. Therefore we can apply
Theorem 5.3 as follows

\[
\begin{aligned}
\log_2 |\mathcal{M}_i| &\le nC_p + K(\lambda, a, d)\sqrt{n} \\
&= \frac{n}{L}\sup_{\{p_j,\rho_j\}} \sum_{i=0}^{L-1} \chi_i(\{p_j, \rho_j\}) + K(\lambda, a, d)\sqrt{n}.
\end{aligned}
\tag{5.4.12}
\]

Also, since $p_e \le \lambda$, there exists $i$ such that $p_e^i \le \lambda$, and

\[
\begin{aligned}
\log_2 |\mathcal{M}_i| &\le C_p\left( \Phi_i^{\otimes n/L} \right) + K(\lambda, a, d)\sqrt{n} \\
&= \frac{n}{L}\sum_{j=0}^{L-1} \chi^*(\Phi_j) + K(\lambda, a, d)\sqrt{n},
\end{aligned}
\tag{5.4.13}
\]

where

\[ \Phi_i^{\otimes n/L} = (\Phi_i \otimes \Phi_{i+1} \otimes \cdots \otimes \Phi_{i+L-1})^{\otimes n/L}. \tag{5.4.14} \]

We can now conclude that, although the strong converse theorem does not hold for
the product-state capacity of the periodic channel defined by Equation (5.4.1), the
"weakened" strong converse does hold, as argued above.



5.5 Summary 

We introduced the method of types [70] and provided Winter's proof [4] of the
strong converse theorem for memoryless quantum channels, updating the notation
and providing detailed proofs for the lemmas used to prove the theorem.

Next we considered the strong converse theorem for the periodic quantum channel
introduced in Chapter 4 and showed that the strong converse does not hold for this
channel. This conclusion is based on a result shown in Chapter 4, namely that
the average and the supremum cannot be interchanged in the
formula for the product-state capacity of the periodic channel with
amplitude-damping channel branches. This formula therefore cannot be rewritten in a way
which would lead to a direct application of the strong converse theorem.

We do show, however, that if we weaken the scenario by assuming that the
channel branch chosen is known in advance, then the strong converse theorem does hold.

The direct part of the channel coding theorem for the periodic quantum channel is
provided in Appendix B.
Appendix A

Caratheodory's Theorem & an application to minimal optimal-ensembles

A.1 Caratheodory's Theorem

Caratheodory's theorem is stated as follows.

Theorem A.1. Let $S \subset \mathbb{R}^d$ be a set. Then every point $x$ in the convex hull of $S$ can
be represented as a convex combination of $d + 1$ points from $S$, i.e.

\[ x = \sum_{i=0}^{d} \alpha_i x_i, \tag{A.1.1} \]

where $x_i \in S$, $\alpha_i \ge 0$ and $\sum_i \alpha_i = 1$.
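Theorem A.1 is constructive in the finite case: a convex combination of more than $d + 1$ points can be thinned out by repeatedly moving weight along an affine dependence until one weight vanishes. The sketch below illustrates this standard reduction with exact rational arithmetic. It is not the proof used in the text, and the function names and the example points are assumptions.

```python
from fractions import Fraction

def null_vector(rows, m):
    """Return a nonzero c of length m with rows . c = 0, assuming m > len(rows)."""
    mat = [list(r) for r in rows]
    pivots = []  # (row index, column index) of each pivot
    r = 0
    for col in range(m):
        pivot = next((i for i in range(r, len(mat)) if mat[i][col] != 0), None)
        if pivot is None:
            continue
        mat[r], mat[pivot] = mat[pivot], mat[r]
        pv = mat[r][col]
        mat[r] = [v / pv for v in mat[r]]
        for i in range(len(mat)):
            if i != r and mat[i][col] != 0:
                factor = mat[i][col]
                mat[i] = [vi - factor * vr for vi, vr in zip(mat[i], mat[r])]
        pivots.append((r, col))
        r += 1
        if r == len(mat):
            break
    pivot_cols = {c for _, c in pivots}
    free = next(c for c in range(m) if c not in pivot_cols)
    vec = [Fraction(0)] * m
    vec[free] = Fraction(1)  # set one free variable to 1, the others to 0
    for row, col in pivots:
        vec[col] = -mat[row][free]
    return vec

def caratheodory(points, weights):
    """Reduce a convex combination in R^d to at most d + 1 points with the same barycentre."""
    d = len(points[0])
    pts = [tuple(Fraction(v) for v in p) for p in points]
    w = [Fraction(v) for v in weights]
    while True:
        support = [i for i, wi in enumerate(w) if wi > 0]
        if len(support) <= d + 1:
            return [(pts[i], w[i]) for i in support]
        # Affine dependence on the support: sum_i c_i x_i = 0 and sum_i c_i = 0.
        rows = [[pts[i][k] for i in support] for k in range(d)]
        rows.append([Fraction(1)] * len(support))
        c = null_vector(rows, len(support))
        # Largest step keeping all weights non-negative; it zeroes at least one weight.
        t = min(w[i] / ci for i, ci in zip(support, c) if ci > 0)
        for i, ci in zip(support, c):
            w[i] -= t * ci

# Hypothetical example: five points in the plane with equal weights.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 2)]
weights = [Fraction(1, 5)] * 5
reduced = caratheodory(points, weights)
print(reduced)
```

Because the coefficients of an affine dependence sum to zero, the total weight and the barycentre are unchanged at every step, so the output is again a convex combination representing the same point.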



A.2 Application 

Next we prove that ensembles containing at most $d^2$ pure states are sufficient to
maximise the Holevo quantity of a CPT map. This proof was provided by N. Datta 
irTTI . We first prove that d 2 + 1 states are sufficient, using Caratheodory's theorem 

ram 

Proof. Note that the set of density operators is described by $d^2 - 1$ real parameters. Let

\[ f(\rho) = (f_1(\rho), \dots, f_{d^2-1}(\rho), f_{d^2}(\rho)) = (f_1(\rho), \dots, f_{d^2-1}(\rho), S(\Phi(\rho))) \tag{A.2.1} \]

be the vector-valued function, with the first $d^2 - 1$ components corresponding to the
linear degrees of freedom of $\rho$, for every density operator $\rho$.

Consider the set of images $f(\mathcal{P}) \subset \mathbb{R}^{d^2}$ of pure states $\mathcal{P}$. To every ensemble of pure
states $\mathcal{E} := \{q_i, |\psi_i\rangle\langle\psi_i|\}$, we can associate the point

\[ f_{\mathcal{E}} = \sum_i q_i f(|\psi_i\rangle\langle\psi_i|) \tag{A.2.2} \]

in the convex hull of $f(\mathcal{P})$. Moreover, the Holevo quantity

\[ \chi(\mathcal{E}) = S\Big( \Phi\Big( \sum_i q_i |\psi_i\rangle\langle\psi_i| \Big) \Big) - \sum_i q_i S(\Phi(|\psi_i\rangle\langle\psi_i|)) \tag{A.2.3} \]

is a function of this vector only, i.e.

\[ \chi(\mathcal{E}) = G(f_{\mathcal{E}}) \tag{A.2.4} \]

for some function $G$.

To see this, note that the average input state $\sum_i q_i |\psi_i\rangle\langle\psi_i|$, and thus the corresponding
output entropy, are completely specified by the first $d^2 - 1$ components of $f_{\mathcal{E}}$ because
of the linearity of the first $d^2 - 1$ components of $f(\cdot)$, and $\sum_i q_i S(\Phi(|\psi_i\rangle\langle\psi_i|))$ is the last entry of $f_{\mathcal{E}}$.

Now, let $\mathcal{E}$ be an ensemble maximising $\chi(\mathcal{E})$. Then, by Caratheodory's theorem, $f_{\mathcal{E}}$
can be represented as

\[ f_{\mathcal{E}} = \sum_{i=1}^{d^2+1} p_i f(|\phi_i\rangle\langle\phi_i|). \tag{A.2.5} \]

We then define the pure state ensemble $\mathcal{E}' := \{p_i, |\phi_i\rangle\langle\phi_i|\}_{i=1}^{d^2+1}$, which consists of
only $d^2 + 1$ pure states. By definition,

\[ f_{\mathcal{E}} = f_{\mathcal{E}'}, \tag{A.2.6} \]

hence $\chi(\mathcal{E}') = \chi(\mathcal{E})$, as desired.

The above proof shows that $d^2 + 1$ states are sufficient. The stronger statement
can be obtained using a strengthening of Caratheodory's theorem by Fenchel and
Eggleston ([35], Theorem 18).

This states that if $S \subset \mathbb{R}^m$ is the union of at most $m$ connected subsets, then every
$x$ in the convex hull of $S$ can be represented as a convex combination of at most $m$
points in $S$.

Since $f$ is continuous and the set of pure states $\mathcal{P}$ is a compact, connected set, the
image $f(\mathcal{P}) \subset \mathbb{R}^{d^2}$ is also compact and connected. Hence $f(\mathcal{P})$ is the
union of only one connected set and we obtain the desired result. □

The above result, that it is sufficient to consider ensembles containing just $d^2$ pure
states when maximising the Holevo quantity, was first shown by Davies [34]. Note
that his proof also utilises Caratheodory's Theorem (Theorem A.1).
Appendix B

Product-state capacity of a periodic quantum channel

B.1 Proof of the product-state capacity of a periodic quantum channel

The following proof is a special case of the proof, given by Datta and Dorlas in [3],
for the product-state capacity of a channel with arbitrary Markovian noise correlations.

B.2 Preliminaries

A general quantum channel is given by completely positive trace-preserving (CPT)
maps $\Phi^{(n)} : \mathcal{B}(\mathcal{H}^{\otimes n}) \to \mathcal{B}(\mathcal{K}^{\otimes n})$, where $\mathcal{H}$ and $\mathcal{K}$ are the input and output Hilbert
spaces of the channel. Here we consider a periodic channel of the following form

\[ \Phi^{(n)}(\rho^{(n)}) = \frac{1}{L}\sum_{i=0}^{L-1} (\Phi_i \otimes \Phi_{i+1} \otimes \cdots \otimes \Phi_{i+n-1})(\rho^{(n)}), \tag{B.2.1} \]

where we assume that a set of CPT maps $\Phi_i : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{K})$ ($i = 0, \dots, L - 1$) is
given, and the index is cyclic modulo the period $L$.

If we denote the Holevo quantity for the $i$-th branch by $\chi_i$, i.e.

\[ \chi_i(\{p_j, \rho_j\}) = S\Big( \sum_j p_j \Phi_i(\rho_j) \Big) - \sum_j p_j S(\Phi_i(\rho_j)), \]

then we shall prove that the product capacity of the channel (B.2.1) is given by

\[ C_p(\Phi) = \sup_{\{p_j,\rho_j\}} \frac{1}{L}\sum_{i=0}^{L-1} \chi_i(\{p_j, \rho_j\}). \tag{B.2.2} \]



B.3 The Quantum Feinstein Lemma

The direct part of the theorem follows from

Theorem B.1. Given $\epsilon > 0$, there exists $n_0 \in \mathbb{N}$ such that for all $n \ge n_0$ there
exists $N_n \ge 2^{n(C_p(\Phi) - \epsilon)}$ and there exist product states $\rho_1^{(n)}, \dots, \rho_{N_n}^{(n)} \in \mathcal{B}(\mathcal{H}^{\otimes n})$ and
positive operators $E_1^{(n)}, \dots, E_{N_n}^{(n)} \in \mathcal{B}(\mathcal{K}^{\otimes n})$ such that $\sum_{k=1}^{N_n} E_k^{(n)} \le 1$ and

\[ \mathrm{Tr}\left[ \Phi^{(n)}(\rho_k^{(n)})\, E_k^{(n)} \right] > 1 - \epsilon, \tag{B.3.1} \]

for each $k$.

Proof. We first construct a preamble to the code which serves to identify the first
branch $i$ chosen. To distinguish the initial branch, notice first of all that the corresponding
CPT maps $\Phi_i$ need not all be distinct. However, we may assume that
there is no internal periodicity of these maps; otherwise the channel can be contracted
to a single such period. This means that for any two states $i, i' \in \{0, \dots, L - 1\}$
($i < i'$) there exists $k \le L - 1$ such that $\Phi_{i+k} \ne \Phi_{i'+k}$. Then choose $\omega = \omega_{i,i'}$ such
that

\[ f := F(\Phi_{i+k}(\omega), \Phi_{i'+k}(\omega)) < 1. \tag{B.3.2} \]

In the following we write $\Phi_i^{(n)}$ for the branch of the channel with initial state $i$, i.e.

\[ \Phi_i^{(n)}(\rho^{(n)}) = (\Phi_i \otimes \Phi_{i+1} \otimes \cdots \otimes \Phi_{i+n-1})(\rho^{(n)}). \tag{B.3.3} \]

Lemma B.3.1. For any $0 \le i < i' \le L - 1$, let $\omega$ be a state as above. Then

\[ F\left( \Phi_i^{(mL)}(\omega^{\otimes mL}), \Phi_{i'}^{(mL)}(\omega^{\otimes mL}) \right) \to 0 \tag{B.3.4} \]

as $m \to \infty$.

Proof.

\[ F\left( \Phi_i^{(mL)}(\omega^{\otimes mL}), \Phi_{i'}^{(mL)}(\omega^{\otimes mL}) \right) \le \left[ F(\Phi_{i+k}(\omega), \Phi_{i'+k}(\omega)) \right]^m = f^m \to 0. \tag{B.3.5} \]

□

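We now introduce, for any pair of states $\sigma, \sigma'$ on $\mathcal{K}$, and $\gamma, \gamma' > 0$, the difference
operators

\[ A^{(M)}_{\sigma,\sigma'} = \gamma\, \sigma^{\otimes M} - \gamma'\, (\sigma')^{\otimes M}. \tag{B.3.6} \]

Let $\Pi^\pm$ be the orthogonal projections onto the eigenspaces of $A^{(M)}_{\sigma,\sigma'}$ corresponding
to all non-negative, and all negative eigenvalues, respectively. In [3] we proved the
following lemma.

Lemma B.3.2. Suppose that for a given $\delta > 0$,

\[ \left| \mathrm{Tr}\left[ |A^{(M)}_{\sigma,\sigma'}| \right] - (\gamma + \gamma') \right| \le \delta. \tag{B.3.7} \]

Then

\[ \left| \mathrm{Tr}\left[ \Pi^+ \sigma^{\otimes M} \right] - 1 \right| \le \frac{\delta}{2\gamma} \tag{B.3.8} \]

and

\[ \left| \mathrm{Tr}\left[ \Pi^- (\sigma')^{\otimes M} \right] - 1 \right| \le \frac{\delta}{2\gamma'}. \tag{B.3.9} \]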



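To compare the outputs of all the different branches of the channel, we define projections
$\tilde{\Pi}_i$ on the tensor product space $\bigotimes_{0 \le i < i' < L} \mathcal{K}^{\otimes M} = \mathcal{K}^{\otimes ML_2}$ with $L_2 = \binom{L}{2}$
as follows

\[ \tilde{\Pi}_i = \bigotimes_{0 \le i_1 < i_2 < L} \Gamma^{(i)}_{i_1,i_2}, \tag{B.3.10} \]

where

\[
\Gamma^{(i)}_{i_1,i_2} =
\begin{cases}
I_M & \text{if } i_1 \ne i \text{ and } i_2 \ne i, \\
\Pi^-_{i_1,i_2} & \text{if } i_2 = i, \\
\Pi^+_{i_1,i_2} & \text{if } i_1 = i.
\end{cases}
\tag{B.3.11}
\]

Notice that it follows from the fact that $\Pi^+_{i,i'} \Pi^-_{i,i'} = 0$ that the projections $\tilde{\Pi}_i$ are
also disjoint,

\[ \tilde{\Pi}_i \tilde{\Pi}_{i'} = 0 \quad \text{for } i \ne i'. \tag{B.3.12} \]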

It now follows easily with the help of the previous lemma and the inequalities lfl4ll 

\[ \mathrm{Tr}(A_1) + \mathrm{Tr}(A_2) - 2F(A_1, A_2) \le \|A_1 - A_2\|_1 \le \mathrm{Tr}(A_1) + \mathrm{Tr}(A_2) \tag{B.3.13} \]

for any two positive operators $A_1$ and $A_2$, that these projections distinguish the
relevant initial branches. Indeed, if we introduce the corresponding preamble state

\[ \omega^{(ML_2)} = \bigotimes_{i_1 < i_2} \omega_{i_1,i_2}^{\otimes M}, \tag{B.3.14} \]

then we have

Lemma B.3.3. For all $i \in \{0, \dots, L - 1\}$,

\[ \lim_{M \to \infty} \mathrm{Tr}\left[ \tilde{\Pi}_i\, \Phi_i^{(ML_2)}(\omega^{(ML_2)}) \right] = 1. \tag{B.3.15} \]

In the following we fix $M$ so large that

\[ \mathrm{Tr}\left[ \tilde{\Pi}_i\, \Phi_i^{(ML_2)}(\omega^{(ML_2)}) \right] > 1 - \delta \tag{B.3.16} \]

for all $i \in \{0, \dots, L - 1\}$. We also assume that $M$ is a multiple of $L$ so that Lemma B.3.1
applies. The product state $\omega^{(ML_2)}$, defined through (B.3.14), is used as a preamble
to the input state encoding each message, and serves to distinguish between the
different branches, $\Phi_i^{(n)}$, of the channel. If $\rho_k^{(n)} \in \mathcal{B}(\mathcal{H}^{\otimes n})$ is a state encoding the
$k$-th classical message in the set $\mathcal{M}_n$, then the $k$-th codeword is given by the product
state

\[ \omega^{(ML_2)} \otimes \rho_k^{(n)}. \]

Note that, since $M$ is a multiple of $L$, the index of the first channel branch applying
to $\rho_k^{(n)}$ is also $i$.
Continuing with the proof of Theorem IB. 11 let the maximum of the mean Holevo 



_ \ ■x/ . V\c% ottoinan Tr\f on c%n cc*n-\\-\\ a ) m . r\ . L " 



J J' 



quantity j Ei=o Xi be attained for an ensemble {pj, pj}j =1 . Denote oi j = $j(p 



Choose 5 > 0. We will relate S to e at a later stage. Consider the typical subspaces 

i n \ _ 

7~ e of /C ™, with projection P ijTl such that if Oi has a spectral decomposition 



o-i = ^J A*,* |-0»,jfc> <-0*,aj I (B.3.17) 
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then if k = (kx,...,k n ), \ip iM ) 



m,kj e T it( if and only if 



1 n 

- y^ log \ k + s(&i 



i=i 



e 

<4- 



(B.3.18) 



Then, for n large enough, 



Tr(P hn af n )>l-5 2 . 



(B.3.19) 



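For any given initial index $i$, we let $\mathcal{V}_{i,\epsilon}^{(n)}$ be the subspace of $\mathcal{K}^{\otimes n}$ spanned by the vectors $|\psi_{i,k_1}\rangle \otimes |\psi_{i+1,k_2}\rangle \otimes \cdots \otimes |\psi_{i+n-1,k_n}\rangle$, where $|\psi_{i,k_1}\rangle \otimes |\psi_{i,k_{L+1}}\rangle \otimes \cdots \otimes |\psi_{i,k_{[(n-1)/L]L+1}}\rangle \in \mathcal{T}_{i,\epsilon}^{([(n-1)/L]+1)}$, etc. Clearly, if we denote by $\bar P_i^{(n)}$ the projection onto $\mathcal{V}_{i,\epsilon}^{(n)}$, then for $n$ large enough,

$$\mathrm{Tr}\big(\bar P_i^{(n)}\, \bar\sigma_i \otimes \bar\sigma_{i+1} \otimes \cdots \otimes \bar\sigma_{i+n-1}\big) > 1 - \delta^2. \qquad \text{(B.3.20)}$$

Moreover, if $|\psi_{i,k_1}\rangle \otimes |\psi_{i+1,k_2}\rangle \otimes \cdots \otimes |\psi_{i+n-1,k_n}\rangle \in \mathcal{V}_{i,\epsilon}^{(n)}$ then

$$\left| \frac{1}{n}\sum_{j=1}^{n} \log \lambda_{i+j-1,k_j} + \frac{1}{L}\sum_{i'=0}^{L-1} S(\bar\sigma_{i'}) \right| < \frac{\epsilon}{4}. \qquad \text{(B.3.21)}$$

Let $n_1$ be so large that (B.3.20) and (B.3.21) hold for $n \ge n_1$.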



We need a similar result for the average entropy

$$\bar S = \frac{1}{L}\sum_{i=0}^{L-1}\sum_{j=1}^{J} p_j\, S(\sigma_{i,j}). \qquad \text{(B.3.22)}$$



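Lemma B.3.4. Fix $i \in \{0, \dots, L-1\}$. Given a sequence $\mathbf{j} = (j_1, \dots, j_n)$ with $1 \le j_r \le J(i+r-1)$, let $P_{i,\mathbf{j}}^{(n)}$ be the projection onto the subspace of $\mathcal{K}^{\otimes n}$ spanned by the eigenvectors of $\sigma_{i,\mathbf{j}}^{(n)} = \sigma_{i,j_1} \otimes \cdots \otimes \sigma_{i+n-1,j_n}$ with eigenvalues $\lambda_{\mathbf{j},\mathbf{k}}^{(n)} = \prod_{r=1}^{n} \lambda_{i+r-1,j_r,k_r}$ such that

$$\left| \frac{1}{n}\log \lambda_{\mathbf{j},\mathbf{k}}^{(n)} + \bar S \right| < \frac{\epsilon}{4}. \qquad \text{(B.3.23)}$$

For any $\delta > 0$ there exists $n_2 \in \mathbb{N}$ such that for $n \ge n_2$,

$$\mathbb{E}\Big(\mathrm{Tr}\big(\sigma_{i,\mathbf{j}}^{(n)}\, P_{i,\mathbf{j}}^{(n)}\big)\Big) > 1 - \delta^2, \qquad \text{(B.3.24)}$$

where $\mathbb{E}$ denotes the expectation with respect to the probability distribution $\{p_{\mathbf{j}}^{(n)}\}$ on the states $\rho_{\mathbf{j}}^{(n)}$.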



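Proof. Define independent random variables $X_1, \dots, X_n$ with distributions given by

$$\mathrm{Prob}\big(X_r = \lambda_{i+r-1,j,k}\big) = p_{i+r-1,j}\, \lambda_{i+r-1,j,k}. \qquad \text{(B.3.25)}$$

By the Weak Law of Large Numbers,

$$-\frac{1}{n}\sum_{r=1}^{n} \log X_r \to -\frac{1}{L}\sum_{i=0}^{L-1}\sum_{j=1}^{J(i)}\sum_k p_{i,j}\, \lambda_{i,j,k} \log \lambda_{i,j,k} = \frac{1}{L}\sum_{i=0}^{L-1}\sum_{j=1}^{J(i)} p_{i,j}\, S(\sigma_{i,j}) = \bar S. \qquad \text{(B.3.26)}$$

It follows that there exists $n_2$ such that for $n \ge n_2$, the typical set $T_\epsilon^{(n)}$ of sequences of pairs $((j_1, k_1), \dots, (j_n, k_n))$ such that

$$\left| \frac{1}{n}\sum_{r=1}^{n} \log \lambda_{i+r-1,j_r,k_r} + \bar S \right| < \frac{\epsilon}{4} \qquad \text{(B.3.27)}$$

satisfies

$$\mathbb{P}\big(T_\epsilon^{(n)}\big) = \sum_{((j_1,k_1),\dots,(j_n,k_n)) \in T_\epsilon^{(n)}}\ \prod_{r=1}^{n} p_{i+r-1,j_r}\, \lambda_{i+r-1,j_r,k_r} > 1 - \delta^2. \qquad \text{(B.3.28)}$$

Obviously, writing $|\psi_{i,\mathbf{j},\mathbf{k}}^{(n)}\rangle$ for the eigenvector of $\sigma_{i,\mathbf{j}}^{(n)}$ with eigenvalue $\lambda_{\mathbf{j},\mathbf{k}}^{(n)}$,

$$P_{i,\mathbf{j}}^{(n)} \ge \sum_{\mathbf{k}\,:\,((j_1,k_1),\dots,(j_n,k_n)) \in T_\epsilon^{(n)}} |\psi_{i,\mathbf{j},\mathbf{k}}^{(n)}\rangle\langle\psi_{i,\mathbf{j},\mathbf{k}}^{(n)}| \qquad \text{(B.3.29)}$$

and

$$\mathbb{E}\Big(\mathrm{Tr}\big(\sigma_{i,\mathbf{j}}^{(n)}\, P_{i,\mathbf{j}}^{(n)}\big)\Big) \ge \mathbb{P}\big(T_\epsilon^{(n)}\big) > 1 - \delta^2. \qquad \text{(B.3.30)}$$

□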
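The law-of-large-numbers step can be illustrated with a small simulation. The Python sketch below is a simplified instance only: it assumes a single input state per branch (so that $X_r$ takes the value $\lambda$ with probability $\lambda$, drawn from the spectrum of the branch active at step $r$), base-2 logarithms, and made-up spectra; the function names are hypothetical. Because the branches cycle with period $L$, the empirical average of $-\log X_r$ settles near the entropy averaged over the branches rather than near the entropy of any single branch.

```python
import math
import random


def mean_entropy(branch_spectra):
    """Average over the branches of the Shannon entropy (in bits) of each spectrum."""
    return sum(-sum(v * math.log2(v) for v in spectrum)
               for spectrum in branch_spectra) / len(branch_spectra)


def empirical_rate(branch_spectra, n, start=0, seed=0):
    """Sample independent X_1, ..., X_n and return -(1/n) * sum_r log2(X_r).

    At step number `step` (counted from 0) the value is drawn from the
    spectrum of branch (start + step) mod L, each eigenvalue being chosen
    with probability equal to itself."""
    rng = random.Random(seed)
    period = len(branch_spectra)
    total = 0.0
    for step in range(n):
        spectrum = branch_spectra[(start + step) % period]
        x = rng.choices(spectrum, weights=spectrum, k=1)[0]
        total -= math.log2(x)
    return total / n


if __name__ == "__main__":
    spectra = [[0.9, 0.1], [0.5, 0.5]]  # illustrative two-branch example
    print("mean entropy:", mean_entropy(spectra))
    for n in (10, 100, 4000):
        print(n, empirical_rate(spectra, n))
```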



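The remainder of the proof is essentially the same as that in Q. Let $N = N(n)$ be the maximal number of product states $\rho_1^{(n)}, \dots, \rho_N^{(n)}$ on $\mathcal{H}^{\otimes n}$ (each of which is a tensor product of states in the maximising ensemble $\{p_j, \rho_j\}_{j=1}^{J}$) for which there exist positive operators $E_1^{(n)}, \dots, E_N^{(n)}$ on $\mathcal{K}^{\otimes ML_2} \otimes \mathcal{K}^{\otimes n}$ such that

(i) $E_k^{(n)} = \sum_{i=0}^{L-1} \Pi_i \otimes E_{k,i}^{(n)}$ and $\sum_{k=1}^{N} E_{k,i}^{(n)} \le \bar P_i^{(n)}$, and

(ii) $\frac{1}{L}\sum_{i=0}^{L-1} \mathrm{Tr}\left[\big(\Pi_i \otimes E_{k,i}^{(n)}\big)\, \Phi_i^{(ML_2+n)}\big(\omega^{(ML_2)} \otimes \rho_k^{(n)}\big)\right] > 1 - \epsilon$, and

(iii) $\frac{1}{L}\sum_{i=0}^{L-1} \mathrm{Tr}\left[\big(\Pi_i \otimes E_{k,i}^{(n)}\big)\, \Phi_i^{(ML_2+n)}\big(\omega^{(ML_2)} \otimes \bar\rho^{\otimes n}\big)\right] \le 2^{-n[C(\Phi) - \frac{1}{2}\epsilon]}$,

where $\bar\rho = \sum_{j=1}^{J} p_j \rho_j$.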

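For each $i = 0, \dots, L-1$ and $\mathbf{j} = (j_1, \dots, j_n)$ such that $1 \le j_r \le J$, we define, as before,

$$V_{\mathbf{j},i}^{(n)} = \Big(\bar P_i^{(n)} - \sum_{k=1}^{N} E_{k,i}^{(n)}\Big)^{1/2} \bar P_i^{(n)}\, P_{i,\mathbf{j}}^{(n)}\, \bar P_i^{(n)} \Big(\bar P_i^{(n)} - \sum_{k=1}^{N} E_{k,i}^{(n)}\Big)^{1/2} \qquad \text{(B.3.31)}$$

and we put

$$V_{\mathbf{j}}^{(n)} = \sum_{i=0}^{L-1} \Pi_i \otimes V_{\mathbf{j},i}^{(n)}. \qquad \text{(B.3.32)}$$

Clearly $V_{\mathbf{j},i}^{(n)} \le \bar P_i^{(n)} - \sum_{k=1}^{N} E_{k,i}^{(n)}$, and $V_{\mathbf{j}}^{(n)}$ is a candidate for an additional measurement operator, $E_{N+1}^{(n)}$, for Bob with corresponding input state $\rho_{N+1}^{(n)} = \rho_{\mathbf{j}}^{(n)} = \rho_{j_1} \otimes \rho_{j_2} \otimes \cdots \otimes \rho_{j_n}$. Clearly, the condition (i), given above, is satisfied and we also have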



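Lemma B.3.5.

$$\frac{1}{L}\sum_{i=0}^{L-1} \mathrm{Tr}\left[\big(\Pi_i \otimes V_{\mathbf{j},i}^{(n)}\big)\, \Phi_i^{(ML_2+n)}\big(\omega^{(ML_2)} \otimes \bar\rho^{\otimes n}\big)\right] \le 2^{-n[C(\Phi) - \frac{1}{2}\epsilon]}. \qquad \text{(B.3.33)}$$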



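Proof. Put $Q_{n,i} = \sum_{k=1}^{N} E_{k,i}^{(n)}$. Note that $Q_{n,i}$ commutes with $\bar P_i^{(n)}$. Using the fact that $\bar P_i^{(n)}\, \Phi_i^{(n)}(\bar\rho^{\otimes n})\, \bar P_i^{(n)} \le 2^{-n[\frac{1}{L}\sum_{i'=0}^{L-1} S(\bar\sigma_{i'}) - \frac{1}{4}\epsilon]}$ by (B.3.21), we have, denoting $\bar\sigma_i^{(n)} = \Phi_i^{(n)}(\bar\rho^{\otimes n})$,

$$\begin{aligned}
\mathrm{Tr}\big[V_{\mathbf{j},i}^{(n)}\, \bar\sigma_i^{(n)}\big]
&= \mathrm{Tr}\Big[\bar\sigma_i^{(n)} \big(\bar P_i^{(n)} - Q_{n,i}\big)^{1/2} \bar P_i^{(n)} P_{i,\mathbf{j}}^{(n)} \bar P_i^{(n)} \big(\bar P_i^{(n)} - Q_{n,i}\big)^{1/2}\Big] \\
&= \mathrm{Tr}\Big[\bar P_i^{(n)} \bar\sigma_i^{(n)} \bar P_i^{(n)} \big(\bar P_i^{(n)} - Q_{n,i}\big)^{1/2} P_{i,\mathbf{j}}^{(n)} \big(\bar P_i^{(n)} - Q_{n,i}\big)^{1/2}\Big] \\
&\le 2^{-n[\frac{1}{L}\sum_{i'=0}^{L-1} S(\bar\sigma_{i'}) - \frac{1}{4}\epsilon]}\, \mathrm{Tr}\Big[\big(\bar P_i^{(n)} - Q_{n,i}\big)^{1/2} P_{i,\mathbf{j}}^{(n)} \big(\bar P_i^{(n)} - Q_{n,i}\big)^{1/2}\Big] \\
&\le 2^{-n[\frac{1}{L}\sum_{i'=0}^{L-1} S(\bar\sigma_{i'}) - \frac{1}{4}\epsilon]}\, \mathrm{Tr}\big(P_{i,\mathbf{j}}^{(n)}\big) \le 2^{-n[\frac{1}{L}\sum_{i'=0}^{L-1} S(\bar\sigma_{i'}) - \bar S - \frac{1}{2}\epsilon]}, \qquad \text{(B.3.34)}
\end{aligned}$$

where, in the last inequality, we used the standard upper bound on the dimension of the typical subspace, $\mathrm{Tr}\big(P_{i,\mathbf{j}}^{(n)}\big) \le 2^{n[\bar S + \frac{1}{4}\epsilon]}$, which follows from Lemma B.3.4. □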



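By maximality of $N$ it now follows that the condition (ii) above cannot hold, that is,

$$\frac{1}{L}\sum_{i=0}^{L-1} \mathrm{Tr}\left[\big(\Pi_i \otimes V_{\mathbf{j},i}^{(n)}\big)\, \Phi_i^{(ML_2+n)}\big(\omega^{(ML_2)} \otimes \rho_{\mathbf{j}}^{(n)}\big)\right] \le 1 - \epsilon \qquad \text{(B.3.35)}$$

for every $\mathbf{j}$, and this yields the following

Corollary 1.

$$\frac{1}{L}\sum_{i=0}^{L-1} \mathbb{E}\left(\mathrm{Tr}\left[\big(\Pi_i \otimes V_{\mathbf{j},i}^{(n)}\big)\, \Phi_i^{(ML_2+n)}\big(\omega^{(ML_2)} \otimes \rho_{\mathbf{j}}^{(n)}\big)\right]\right) \le 1 - \epsilon. \qquad \text{(B.3.36)}$$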



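We also need the following lemma.

Lemma B.3.6. For all $\eta' > \delta^2 + 3\delta$,

$$\frac{1}{L}\sum_{i=0}^{L-1} \mathbb{E}\left(\mathrm{Tr}\left[\big(\Pi_i \otimes \bar P_i^{(n)} P_{i,\mathbf{j}}^{(n)} \bar P_i^{(n)}\big)\, \Phi_i^{(ML_2+n)}\big(\omega^{(ML_2)} \otimes \rho_{\mathbf{j}}^{(n)}\big)\right]\right) > 1 - \eta' \qquad \text{(B.3.37)}$$

if $n$ is large enough.

Proof. This is proved as in Q. □

Lemma B.3.7. Assume $\eta' < \frac{1}{3}\epsilon$ and write

$$Q_{n,i} = \sum_{k=1}^{N} E_{k,i}^{(n)}. \qquad \text{(B.3.38)}$$

Then for $n$ large enough,

$$\frac{1}{L}\sum_{i=0}^{L-1} \mathbb{E}\left(\mathrm{Tr}\left[\big(\Pi_i \otimes Q_{n,i}\big)\, \Phi_i^{(ML_2+n)}\big(\omega^{(ML_2)} \otimes \rho_{\mathbf{j}}^{(n)}\big)\right]\right) \ge \eta'^2. \qquad \text{(B.3.39)}$$

Proof. This follows as before from the previous lemma using the Cauchy-Schwarz inequality. □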

It now follows that for $n$ large enough, $N(n) \ge (\eta')^2\, 2^{n[C(\Phi) - \frac{1}{2}\epsilon]}$. We take the following states as codewords,

$$\rho_k^{(ML_2+n)} = \omega^{(ML_2)} \otimes \rho_k^{(n)}. \qquad \text{(B.3.40)}$$



For $n$ sufficiently large we then have

$$N_{n+ML_2} = N(n) \ge (\eta')^2\, 2^{n[C(\Phi) - \frac{1}{2}\epsilon]} \ge 2^{(ML_2+n)[C(\Phi) - \epsilon]}. \qquad \text{(B.3.41)}$$
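The last inequality in (B.3.41) can be made explicit by taking logarithms. The short derivation below is a worked check rather than part of the original argument, and it assumes that the exponents are understood with base-2 logarithms.

```latex
\begin{align*}
(\eta')^2\, 2^{n[C(\Phi) - \frac{1}{2}\epsilon]} \ge 2^{(ML_2+n)[C(\Phi) - \epsilon]}
&\iff 2\log_2 \eta' + n\big[C(\Phi) - \tfrac{1}{2}\epsilon\big]
      \ge (ML_2 + n)\big[C(\Phi) - \epsilon\big] \\
&\iff \tfrac{1}{2}\, n\, \epsilon \ge ML_2\big[C(\Phi) - \epsilon\big] - 2\log_2 \eta' \\
&\iff n \ge \frac{2}{\epsilon}\Big( ML_2\big[C(\Phi) - \epsilon\big] - 2\log_2 \eta' \Big).
\end{align*}
```

Since $M$, $L_2$, $\epsilon$ and $\eta'$ are fixed before $n$ is chosen, the right-hand side is a constant, so the inequality holds for all sufficiently large $n$.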



To complete the proof we need to show that the set $E_k^{(n)}$ satisfies (B.3.1). But this follows immediately from condition (ii),

$$\mathrm{Tr}\left[\Phi^{(ML_2+n)}\big(\rho_k^{(ML_2+n)}\big)\, E_k^{(n)}\right] \ge \frac{1}{L}\sum_{i=0}^{L-1} \mathrm{Tr}\left[\Phi_i^{(ML_2+n)}\big(\omega^{(ML_2)} \otimes \rho_k^{(n)}\big)\, \big(\Pi_i \otimes E_{k,i}^{(n)}\big)\right] > 1 - \epsilon. \qquad \text{(B.3.42)}$$

□



We have now provided a proof for the direct part of the channel coding theorem 
for the periodic quantum channel introduced in Chapter 4. Note that in Chapter 
5 we show that the strong converse theorem does not in fact hold for the periodic 
quantum channel. 
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